1. Introduction
The field of algal ecology is still largely dominated by classical approaches, such as microscopic identification of algae including abundance measurements, spectrometric determination of chlorophyll a and other group-specific compounds, and establishment of cultures for further analysis [1,2,3]. In contrast, molecular methods (metabarcoding, metagenomics, metatranscriptomics) are widely used in other fields of microbial ecology (e.g., the study of fungi and bacteria). Only barcoding is used to some extent in algal ecology, mostly to identify cultured algae at the molecular level.
In the light microscopic approach, the taxa of interest are typically identified by morphological characteristics such as colour, cell size and shape, or motility [4,5,6,7]. The use of light microscopy for identification is relatively fast and inexpensive compared to molecular techniques [8], but requires expert knowledge for many taxa, as morphological features are often difficult to recognise and distinguish [9].
Morphological characteristics are not always stable, as they can change in response to environmental factors [10]. Alternatively, a set of molecular sequence markers, known as barcodes, can be used when unialgal cultures are available [11]. Typical barcode sequences used for algae are the 16S/18S rRNA genes, the internal transcribed spacer (ITS) rDNA, the RuBisCO large subunit (rbcL), the plastid elongation factor tufA, and cytochrome oxidase I (COX I) [12,13,14,15,16,17,18]. A major drawback of this approach is that the majority of microorganisms present in environmental samples are unculturable [19,20,21,22]. To overcome the limitations of cultivation, metabarcoding can be used to assess biodiversity [23,24,25]. Total environmental DNA (eDNA) is extracted from an environmental sample and used as a template to generate an amplicon mixture from a barcoding gene [23]. The resulting PCR products are then sequenced using high-throughput sequencing (HTS) technology [23,26]. Taxa are identified by annotating the resulting sequence reads against an appropriate database, and sequence counts provide information on taxonomic abundance in the sample [23,27]. However, (meta)barcoding also has several pitfalls, such as the introduction of sequence errors during PCR, the difficulty of designing metabarcoding primers that cover all taxa of interest, and, again, the need for appropriate reference databases [23].
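As a minimal illustration of the final counting step, the following Python sketch converts per-read taxonomic assignments into relative abundances. The read assignments are invented for illustration and are not data from this study, and the function name `relative_abundance` is hypothetical.

```python
from collections import Counter


def relative_abundance(assignments):
    """Return {taxon: fraction of reads} from a list of per-read taxon labels."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical annotation result: one genus label per sequence read.
reads = [
    "Navicula", "Navicula", "Closterium", "Navicula",
    "Cosmarium", "Closterium", "Navicula", "Navicula",
]

abundance = relative_abundance(reads)

# Print taxa from most to least abundant.
for taxon, fraction in sorted(abundance.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}: {fraction:.3f}")
```

In a real analysis the labels would come from the database annotation step, and reads without a confident assignment would need to be handled explicitly.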
PCR-dependent bias and reliance on single barcodes can be avoided by using shotgun metagenomics and metatranscriptomics [28,29]. Similar to metabarcoding, total nucleic acids are isolated from an environmental sample, but the amplification step is omitted [28,29]. Instead, total DNA or cDNA is sequenced directly by HTS and the resulting sequences are assembled for any gene or transcript of interest, such as small ribosomal subunit RNA [30]. This powerful approach allows more reliable taxonomic identification than metabarcoding [31], but depends on the availability of correctly annotated reference sequence data [28,32].
While working on biological soil crusts in polar regions, we identified a significant potential of metagenomics and metatranscriptomics for algal ecology (e.g., [33,34,35,36]). We were able to show that metabarcoding revealed higher biodiversity than the traditional light microscopy approach [34]. Furthermore, in a recent paper we showed that metagenomics is a more effective approach than metabarcoding for studying algal biodiversity in natural habitats [31]. In this study, we directly compare light microscopic and metagenomic methods to study the biodiversity of freshwater ponds in the Eifel National Park, Germany.
The Eifel National Park was established in 2004 by the German state of North Rhine-Westphalia. The park covers an area of approximately 110 square kilometres in the south-west of the state. Its aim is to allow nature to develop largely undisturbed, and it consists of areas that are completely closed to the public and managed areas that allow various types of activities. The park is intended to play a major role in the protection of flora and fauna in North Rhine-Westphalia. In order to achieve this goal, it is necessary to make an inventory of all organisms living in the national park (for an up-to-date summary, see [37]). While this is a relatively simple task for many macrophytes and metazoans, the detection and identification of microbial diversity is still time-consuming and difficult. To this end, algal biodiversity in the Eifel National Park has been monitored using the classical approach by one of us (Linne von Berg) almost since its inception, and this work has been documented in annual reports. A total of 926 algal species, including 66 cyanobacteria, have been documented [38].
In this report, we determine the microalgal diversity (with a focus on diatoms and zygnematophytes) of 14 artificial water bodies by light microscopy, and the microbial diversity of 5 of these water bodies using a metagenomic approach. The water bodies are located mainly in the closed area of the national park. All of them are man-made (fishing ponds), often constructed many years before the establishment of the national park, and were maintained when the national park was established. In a few cases, additional ponds were added after the establishment of the national park. We show that although molecular methods offer considerable advantages, there were still a number of algal species that we only observed using classical methods.
4. Discussion
Microalgal Biodiversity of Eifel Ponds
The biodiversity of the microalgae in the Eifel National Park has been monitored almost since its establishment. By December 2020, 926 different algae, including 66 species of cyanobacteria, had been recorded [38]. Of these, 2016 are rare species included in red lists [38]. In this study, we provide molecular evidence for the presence of a further 157 previously unreported genera. This represents a substantial increase in the documented algal diversity. The real additional molecular diversity is likely to be much higher, for two reasons. First, we only investigated 5 ponds using metagenomics. Second, because the reference databases are still incomplete, reads could only be assigned at the genus level using the SILVA database, so each genus identified by molecular methods may represent many species (see, for example, in Supplementary Table S2 the large number of desmid species identified by light microscopy that represent a single genus in the ponds investigated). One of the most striking differences between the two methods in this study is the large number of rRNA reads found for the dinoflagellate Jadwigia spec. Jadwigia was only found in a single pond (SU1) and only by metagenomics. Jadwigia was only described in 2005 [51,52] and had never previously been observed in a pond in the Eifel National Park. As we also sampled sediments from the investigated ponds for the metagenomic approach, we think that a large number of hypnozygotes of Jadwigia could have been present in the sediment. Future studies are needed to support this explanation.
Differences between the Two Approaches
In this report, we determined the microalgal biodiversity of 14 ponds using the classical approach (identification of species by light microscopy [4,5,6,7]). Based on the light microscopical results, 5 ponds representing the observed diversity were selected and analysed by a molecular approach, which has been suggested to be superior to the classical approach [23,24,25]. Initially, most molecular studies used metabarcoding; more recently, however, metagenomics has started to replace it. We chose metagenomic sequencing of environmental DNA because we recently showed it to be preferable to metabarcoding [31]. Metagenomics allows a deeper investigation of biodiversity (detection of rare species) and, unlike metabarcoding, is not affected by PCR biases. Similar to earlier results [34], we found a larger number of microalgal genera using the molecular method (207) than using the classical approach (81). In addition, metagenomics (mapping reads to sequence databases) always yields read counts that can be used for quantitative analyses [23,27], while it is still very difficult to infer the abundance of microalgae using the classical methods, which employ either Utermöhl sedimentation chambers (fixed material, [53]) or, more often, artificial numerical scores to record the abundance of the microalgae observed. Another major advantage of the molecular method is that it allows the detection and analysis of the complete biodiversity of the investigated ponds. Light microscopical identification relies on expert training, and no single expert can identify all eukaryotic groups.
However, it is important to note that molecular methods still have some problems of their own. Most strikingly, 31 genera were only found using light microscopy, suggesting that the molecular method still introduces some artificial bias, either at the DNA isolation or at the sequencing stage of the investigation. Another drawback is that while microscopic observations often allow us to identify microalgae at the species level, molecular methods generally allow assignment only at the genus level. Incomplete and wrongly annotated reference data are a major problem when performing molecular analyses of any kind and impair the correct determination of genera in environmental samples [23,32]. The microbial dark matter, meaning the total of unculturable microorganisms, complicates this matter even more, as classical methods cannot add sequence information for these organisms to the databases [54]. However, the developing omics techniques can overcome the limit of culturability and supply the databases with novel sequences [54].
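The genus counts reported in this section can be combined with simple set arithmetic. The sketch below assumes that the 31 microscopy-only genera are part of the 81 genera seen by light microscopy and that all remaining microscopy genera were also detected by metagenomics; the function and variable names are illustrative only.

```python
def partition_genera(n_molecular, n_microscopy, n_microscopy_only):
    """Split genus counts into shared, molecular-only and microscopy-only parts.

    Assumes the microscopy-only genera are a subset of the microscopy total
    and that every other microscopy genus was also found by the molecular method.
    """
    shared = n_microscopy - n_microscopy_only
    molecular_only = n_molecular - shared
    combined = n_molecular + n_microscopy_only
    return {
        "shared": shared,
        "molecular_only": molecular_only,
        "microscopy_only": n_microscopy_only,
        "combined": combined,
    }


# Counts taken from the text: 207 molecular genera, 81 microscopy genera,
# 31 genera found only by light microscopy.
result = partition_genera(n_molecular=207, n_microscopy=81, n_microscopy_only=31)
print(result)
# {'shared': 50, 'molecular_only': 157, 'microscopy_only': 31, 'combined': 238}
```

Under these assumptions, the molecular-only count of 157 agrees with the number of previously unreported genera stated above.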
Molecular and light microscopical methods gave similar results regarding the ecology of the investigated ponds. Water depth, pH and nitrite correlated with the observed biodiversity differences. In addition, the molecular method identified two further factors correlating with the observed differences: conductivity and carbonate hardness. The reason for this better “resolution” of the discriminating factors might be twofold: (1) greater alpha diversity and (2) better abundance estimates. The number of rRNA reads obtained for a species is directly correlated with its cell number. This relationship may differ between algae, depending on rRNA gene copy number and genome size, which lead to different percentages of the genome coding for rRNA. However, these factors are constant for each species and allow a direct comparison of the numbers for the different algae [23,27].
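This argument can be made concrete with a toy model in which the rRNA read count of a species is proportional to its cell number multiplied by a species-specific factor (reflecting rRNA gene copy number and genome size). All names and numbers below are invented for illustration; the sketch only shows that a constant species-specific factor cancels when the same species is compared between two ponds.

```python
def expected_reads(cells, reads_per_cell):
    """Toy model: read count is proportional to cell number."""
    return cells * reads_per_cell


def between_pond_ratio(cells_a, cells_b, reads_per_cell):
    """Ratio of read counts for one species in pond B relative to pond A."""
    return expected_reads(cells_b, reads_per_cell) / expected_reads(cells_a, reads_per_cell)


# Hypothetical species-specific factors (reads obtained per cell).
reads_per_cell = {"alga_A": 4.0, "alga_B": 0.5}

# Hypothetical cell numbers in two ponds.
cells = {
    "pond_1": {"alga_A": 100, "alga_B": 100},
    "pond_2": {"alga_A": 300, "alga_B": 50},
}

for species, factor in reads_per_cell.items():
    read_ratio = between_pond_ratio(
        cells["pond_1"][species], cells["pond_2"][species], factor
    )
    cell_ratio = cells["pond_2"][species] / cells["pond_1"][species]
    # The read ratio equals the cell ratio regardless of the factor.
    print(f"{species}: read ratio {read_ratio}, cell ratio {cell_ratio}")
```

The model deliberately ignores sampling noise and extraction efficiency, both of which would affect real read counts.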