3.2. Sequencing and Metagenomic Assembly
The Illumina HiSeq sequencing platform produced a total of 81,611.46 Mbp of raw data (average data volume was 9,067.94 Mbp). After quality control, 81,245.68 Mbp of clean data were obtained (average data volume was 9,027.30 Mbp), and the effective data rate of the quality control was 99.55% (Table S1). After single sample assembly and mixed assembly, a total of 2,155,261,173 bp scaftigs were obtained with an average length of 1,178.16 bp, maximum length of 514,363 bp, N50 length of 1,230.89 bp, and N90 length of 573.11 bp. Scaffolds were interrupted at N to generate scaftigs, yielding a total of 2,155,261,173 bp scaftigs with an average length of 1,178 bp; 1,231 bp for N50 and 573 bp for N90 (Table S2).
3.3. Microbial Diversity and Community Composition
There were 2,485,201 prediction genes after the original redundancy removal, among which the number of ORFs annotated in the NR database was 1,781,793 (71.70%), and 83.52% of ORFs annotated in NR database reached the boundary level. The proportions were 79.82%, 74.49%, 69.68%, 66.09%, 54.98%, and 44.15% at the phylum, class, order, family, genus, and species levels, respectively.
Among all seawater microorganisms, the community structure of macroalgae-associated bacteria has certain specificity, and the two can be bidirectionally selected through interactions [
28]. Therefore, studying the macroalgae-associated bacteria community is a necessary prerequisite to understanding the relationship between algae and bacteria. At the phylum level, Proteobacteria, Bacteroidetes, and Actinobacteria had a significantly high relative abundance in NJDZ, YNDZ, and WHDZ samples (
Figure 1a,
Figure 2a). However, Proteobacteria, Bacteroidetes, and Actinobacteria had a significantly higher relative abundance in WHDZ samples than in NJDZ and YNDZ samples. Among them, the maximum abundances of Proteobacteria, Bacteroidetes, and Actinobacteria were recorded in WHDZ01, NJDZ01, and WHDZ02, respectively (
Figure 2c).
Proteobacteria is the most dominant seawater phylum [
29]. Bacteroidetes can degrade some biological macromolecules such as chitin and cellulose [
30]. Actinobacteria can degrade organic pollutants and play an important role in marine pollution remediation [
31,
32]. In addition, some phylum with a low abundance were also recorded, such as Cyanobacteria and Firmicutes, among which Cyanobacteria had a high abundance in YNDZ02 and YNDZ03, while Firmicutes had a high abundance in YNDZ01 and YNDZ02. Vieira et al. pointed out that Firmicutes may be related to sea pollution; therefore, sea areas with high Firmicutes abundance may indicate pollution [
32]. Non-metric multidimensional scaling showed that the microbial communities of the NJDZ and WHDZ samples had a similar species composition, and the microbial community of YNDZ samples had a different species composition (
Figure 1b). Environmental and ecological selection may play a role in generating and maintaining microbial diversity in a geographically defined and seemingly unstructured marine ecosystem. Algal microbiomes are complex and dynamic, and their diversity may be driven by ecological or environmental selection to generate and maintain these intimate relationships over space and evolutionary time [
33,
34].
The most abundant microbial class among the nine macroalgae samples also significantly differed according to region. The dominant macroalgae-associated bacteria of NJDZ samples were mostly Gammaproteobacteria, Flavobacteriia, Alphaproteobacteria, and Saprospiria; the dominant macroalgae-associated bacteria of YNDZ samples were mostly Alphaproteobacteria, Gammaproteobacteria, Actinobacteria, Flavobacteriia, Acidimicrobiia, and Bacilli; and the dominant macroalgae-associated bacteria of WHDZ samples were mostly Alphaproteobacteria, Flavobacteriia, Gammaproteobacteria, Actinobacteria, and Acidimicrobiia. It is worth noting that Gammaproteobacteria was the dominant class in NJDZ samples, while Alphaproteobacteria was the dominant class in WHDZ and YNDZ samples (Figure 2b, Figure 2d). These results indicate that the community composition of the macroalgae-associated bacteria is not only related to the environment, but also affected by the host, in accordance with natural selection.
3.4. Functional Prediction of Bacterial Communities
DIAMOND software was used to annotate the database of common functions of non-redundant gene sets (e-value ≤ 10−5); there were 2,485,201 prediction genes after the original redundancy was removed, and 1,493,359 (60.09%) genes could be compared with the KEGG database, among which, 806,765 (32.46%) genes could be compared with 7,566 KEGG ortholog groups. There were 1,428,379 (57.48%) genes that could be compared with the eggNOG database, and 61,807 (2.49%) that could be compared with the CAZy database.
Regarding the community functional level, there were significant differences in the composition and abundance of functional genes of macroalgae-associated bacteria on the surface of red algae among regions. WHDZ samples had a higher genetic diversity regarding all of the functional properties, while YNDZ samples had a lower genetic diversity (
Figure 3). The distribution of KEGG subclasses (level 1) showed a greater number of genes related to various metabolism pathways, followed by genetic information processing pathways (
Figure 3a,
Figure 3b). Macroalgae-associated bacteria communities were significantly enriched for genes associated with amino acid metabolism, carbohydrate metabolism, energy metabolism, metabolism of cofactors and vitamins, membrane transport, nucleotide metabolism, and cellular community-prokaryotes (
Figure 3a). A principal components analysis plot showed that microbial communities in NJDZ, YNDZ, and WHDZ samples differed in terms of functional structure (
Figure 3c).
The annotated genes in the Clusters of Orthologous Genes database are divided into 21 Clusters of Orthologous Genes functional classes. In this analysis, a large number of contigs were classified as “function unknown (S)”, “amino acid transport and metabolism (E)”, “energy production and conversion (C)”, “replication, recombination and repair (L)”, “cell wall/membrane/envelope biogenesis (M)”, and “inorganic ion transport and metabolism (P)” (Figure S1).
3.5. CAZymes Insights of the Bacterial Community
According to the BLASTX results of total contigs against the CAZy database, combined with the hierarchy structure of CAZymes database, the information of each level or level of CAZymes could be obtained. Comparing the various strains of the six families to the CAZy database (
Figure 4a) revealed 2,151,233 carbohydrate active enzyme genes, among which the GH family had the most genes (24,756), followed by the family of glycosyltransferases (GTs) genes (20,694), carbohydrate-binding modules (CBMs) family genes (11291), carbohydrate esterases family genes (3,079), PL family genes (2,564), and auxiliary activities family genes (1,539). The strains with more GH family genes also had more GT and CBM family genes (
Figure 4b), which might be related to the functional correlation of enzymes in each family. The most abundant families predicted in this genome were GH28, GH38, GT2, GT4, CBM13, GT51, GH3, CBM6, CBM50, and GH23 from level 2 of the CAZymes database (
Figure 4c). Further analysis showed that the strains with a high carbohydrate active enzyme gene richness were mainly from Pseudoalteromonadaceae of Proteobacteria and Flavobacteriaceae of Bacteroidetes.
Marine algae are the most promising raw material for the replacement of land plants, and algal oligosaccharides are degraded and utilized by marine heterotrophic microorganisms [
35]. Many enzymes in algae-related microorganisms, such as GH and PL, are involved in this important metabolic process [
6,
35]. The GH family can hydrolyze the glycosidic bonds between two or more carbohydrates or between carbohydrate and non-carbohydrate components, and are essential in the red algae polysaccharide hydrolyzing process [
36]. The GT family is involved in the biosynthesis of disaccharides, oligosaccharides, and polysaccharides by catalyzing the transfer of glycosylates from activated donor molecules to specific receptors to form glycosidic bonds. The most widely distributed GT2 family proteins have the activities of cellulose synthetase, chitin synthetase, hyaluronic acid synthetase, glucan synthetase, and mannan synthetase [
37]. These enzymes are related to the synthesis of algae polysaccharides, indicating that these macroalgae-associated bacteria can not only degrade algae polysaccharides, but also synthesize related polysaccharides themselves.
The main components of red algae are agar and carrageen. Research on carrageenase is the earliest and most extensive; κ-carrageenase belongs to the GH16 family and τ-carrageenase belongs to the GH82 family. β-Agarase belongs to the GH16, GH39, GH50, GH86, and GH118 families, while α-agarase belongs to the GH96 and GH117 families [
38]. β-agarase in the GH16 and GH86 families is an endonuclease, β-agarase in the GH50 family has exonuclease activity, and α-agarase in the GH117 family is an exonuclease [
39]. Therefore, when macroalgae-associated bacteria communities contain a variety of GH family enzyme coding genes, there is an opportunity for the emergence of endo-cut β-agarase, exo-cut β-agarase, and exo-cut α-agarase systems, which eventually degrade polysaccharides into oligosaccharides.
Annotation of the functional genes of the macroalgae-associated bacteria genome revealed that the bacteria genome isolated from the surface of red algae contained a large number of genes related to the degradation of alginate. This indicated that most of the strains on the surface of algae had the basic function of degrading alginate, which was related to their living environment and their own carbon source utilization function [
40,
41]. Therefore, the diversity of these functional genes was further analyzed, revealing the presence of the PL7 family in strains distributed in Proteobacteria and Bacteroidetes. These results indicate that the microorganisms on the surface of algae are rich in alginate lyase, and that algae degradation requires the coordination of enzymes with multiple functions. These enzymes work together to utilize algae polysaccharides.
3.6. Mining of Enzymes for Algal Polysaccharide Degradation
Increasing attention has been paid to marine oligosaccharides due to their bioactivity, solubility, and bioavailability. Studies have shown that algae oligosaccharides have many potential applications [
42]. The in-depth functional analysis of nine macroalgae metagenomic data was performed to identify new algal polysaccharide-degrading enzymes for biotechnological purposes (
Table 2 and
Figure 5). Metagenomic data were then further analyzed to explore the number of algal polysaccharide-degrading enzyme genes. There were significant differences in algal polysaccharide-degrading enzymes on the surface of red algae among the remote regions. The number of alginate lyase, agarose, and carrageenase genes on the surface of red algae among the regions is shown in Table 3. The number of agarase genes from WHDZ was 1,076, and that from NJDZ was 939, while YNDZ had only 41. The number of carrageenase genes from WHDZ was 940, and that from NJDZ was 759, with YNDZ only having 33. The number of alginate lyase genes from WHDZ was 840, and that from NJDZ was 829, while YNDZ had only one. The agarase gene was mainly distributed in WHDZ03, followed by NJDZ01. There were very few agarase genes in YNDZ, with only three alginate lyase genes in YNDZ01 and 38 in YNDZ03. The carrageenase gene was mainly distributed in WHDZ03, followed by NJDZ01. Similarly, there were very few carrageenase genes in YNDZ, with only three carrageenase lyase genes in YNDZ01 and 30 in YNDZ03. The alginate lyase gene was mainly distributed in NJDZ01, followed by WHDZ03. There was also very few alginate lyase genes in YNDZ, with only one in YNDZ03.
These results suggest that WHDZ and NJDZ had a high abundance of algal polysaccharide-degrading enzymes, while YNDZ had an extremely low abundance. There is evidence that the characteristics of the algae itself, secretions at different growth stages (small organic molecules such as early amino acids and organic acids, and large organic molecules such as algal polysaccharides and lipids), and environmental conditions for algal growth (such as pH, water flow, light, temperature, and nutrients) affect the composition of the macroalgae-associated bacterial community, and thus enzyme abundance [
44,
45,
46]. Of note, there were significant differences in the algal polysaccharide-degrading enzymes on the surface of red algae among the remote regions, especially in the NJDZ samples. The environmental differences between NJDZ
, WHDZ, and YNDZ may impose a strong geographic differentiation in the biodiversity of algal microbiomes and their expressed enzyme genes.
Numerous macroalgae-associated marine microorganisms contain undeveloped enzymes such as GH and PL, which are involved in important metabolic pathways and whose characterization has the potential to provide new insights into the marine carbon cycle [
24]. This study effectively extends our understanding of the diversity of algae surface microorganisms and their polysaccharide-degrading enzymes. It also deepens our understanding of how microorganisms process the hundreds of millions of tons of polysaccharides produced by algae each year and their role in the marine carbon cycling system.
3.7. Environmental Factor Analysis
Canonical correspondence analysis (CCA) and redundancy analysis (RDA) analyses were performed using R software (Version 2.15.3), and relationships between bacterial communities and environmental variables were constructed at the genus level. As shown in
Figure 6a and
Figure 6b, the first and second spindles of the CCA accounted for 56.85% and 43.15% of the variance of the relative abundance of the bacterial community, respectively, and the first and second spindles of RDA accounted for 53.46% and 46.54%, respectively. This indicated that the relationship between the bacterial community and environmental variables was reliable. The CCA and RDA analyses of environmental factors showed that, without considering the species and composition of algae, pH and temperature were the main environmental factors affecting bacterial community structure. Most previous studies have compared native species diversity (alpha diversity) at different latitudes [
46]. However, recent studies have shown that temperature is also one of the most important variables explaining the differences in local community species composition over large-scale latitudinal gradients [
34,
47].
Salinity, temperature, pH, and DO are the main environmental factors affecting macroalgae-associated bacteria communities. They not only affect the community structure and distribution of bacteria, but also can be used to indicate the environmental status of marine ecosystems [
44,
45]. In order to further explore the influence of environmental factors on the macroalgae-associated bacteria communities, a correlation heat map analysis was conducted, and the results are shown in
Figure 7.
Limnothrix and
Leptolyngbya were positively correlated with temperature and pH, but negatively correlated with DO.
Psychroflexus and
Psychromonas were positively correlated with DO, but negatively correlated with temperature and pH. In addition,
Maribacter,
Nitratireductors, and
Robiginitomaculum were negatively correlated with salinity, and
Acinetobacter was positively correlated with salinity. This showed that environmental factors had varying effects on different macroalgae-associated bacteria.
3.8. Diverse Bacterial Linages Potentially Harbor Antibiotic Resistance Genes
This study further screened the diversity and abundance of the antibiotic resistance genes (ARGs) in the microbial gene catalog. The ARG-like genes were assigned to 20 ARG classes, with high proportions being unclassified (Figure S2a). The total numbers of ARG classes in YNZD02 were surprisingly higher than in other samples, suggesting that human activities may also lead to ARG enrichment [
48,
49]. YNZD had a higher relative abundance of ARG classes compared with NJDZ and YNDZ (Figure S2b). These findings suggest that macroalgae are a neglected but potentially important reservoir of ARGs. The main ARG classes detected in YNZD02 were
TEM-171 and
tetG (tetracycline resistance) genes (Figure S2a). These findings suggest that macroalgae in Halmahera, Indonesia are possibly contaminated by anthropogenic ARGs to a certain extent. The diversity of ARG hosts with multiple antibiotic potential resistance suggests that these bacteria are ARG hosts and may play a key role in the acquisition and spread of antibiotic resistance in macroalgae.
ARGs are present in almost all environments. They are either endemic to the natural environment or derived from human-dominated ecosystems [
50,
51]. However, the diversity and hosts of ARGs in macroalgae remain unclear. Their abundance of microbes indicates that macroalgae could be a potential reservoir for ARGs. If these microbes that are resistant to various antibiotics are algal pathogens, this could worsen an outbreak of macroalgal disease. In addition, macroalgal ecosystems may be contaminated by antibiotics and ARGs from human and agricultural wastes [
52]. Residues of antibiotics in aquatic systems can be deposited in macroalgae, and may eventually threaten their growth and affect microbial ecology [
53]. Therefore, studying ARGs is an important component of evaluating macroalgae ecosystem health.