1. Introduction
Species identification is fundamental to biology and ecology [
1]. It serves as the foundation for ecological research, enabling the understanding of species richness, diversity, and ecosystem health [
2]. Additionally, it aids in identifying endangered, invasive, and keystone species within ecosystems, facilitating effective conservation and management strategies [
3]. Moreover, species identification plays a crucial role in predicting and preventing infectious disease outbreaks by identifying potential disease hosts and transmitters among wild animal species [
4]. In food production industries, species identification ensures authenticity, quality, and safety, preventing fraud and the circulation of substandard products [
5]. Furthermore, it is relevant to criminal and forensic cases, where it helps identify the origin of wildlife products [
6]. Traditional methods of species identification rely on morphological characteristics and are limited in discriminating taxa with minimal morphological differences or complex phylogeny. To overcome these limitations, DNA barcoding has emerged as an effective alternative.
DNA barcoding is a molecular biology technique for the identification of biological species by examining distinct DNA segments. It uses variations in short DNA sequences to provide rapid and reliable species identification [
5,
7,
8,
9,
10,
11]. DNA barcoding enables the analysis of specific gene regions, aiding in the identification and differentiation of morphologically similar species [
5,
10]. The concept of DNA barcoding was first proposed by Paul Hebert, who suggested using a short, standardized gene region to identify species [
5]. Initially, DNA barcoding was widely used in animals, where the mitochondrial gene encoding cytochrome c oxidase subunit I (COI) has high species discrimination power, especially in insects, birds and fish [
12,
13]. Therefore, the COI gene has become the preferred choice for universal DNA barcoding in animals due to its high level of accuracy in species identification [
14]. However, in plant mitochondrial genomes, the COI gene is highly conserved and is therefore unsuitable as a DNA barcode [
15]. In addition, complex evolutionary events such as hybridization, polyploidization, and lineage selection are more common in plants than in animals, further increasing the difficulty of screening fragments suitable for DNA barcoding [
16]. Unlike the universal COI gene fragments in animals, DNA barcoding research in plants has undergone a screening process of a large number of fragments [
17]. Currently, the internationally recognized universal plant DNA barcodes comprise four regions: ITS (the internal transcribed spacer region: internal transcribed spacer 1, 5.8S and internal transcribed spacer 2), matK, rbcL and psbA-trnH [
18]. The selection of these regions takes into account the genetic diversity and evolutionary history of the plant kingdom, improving the discriminatory power and applicability of plant DNA barcodes. However, these fragments also have limitations, so Kane and Cronk proposed ultra-barcoding, which uses complete plastomes for plant species identification [
19]. DNA barcoding has achieved significant success in plant species identification and classification and has provided a common standard for the international botanical community [
20]. It has been widely applied to tasks such as revealing hidden species, identifying invasive species, and elucidating food webs [
21]. Moreover, it serves as a reliable method for verifying herbal medicinal products and detecting product substitution and contamination [
22,
23,
24,
25]. Morphologically similar herbs are frequently used as adulterants in the commercial herbal market. Although discriminating closely related species by DNA barcoding can be challenging, the technique excels at distinguishing species that are morphologically indistinguishable but genetically distinct [
26]. In short, DNA barcoding is a valuable tool in biological research, enabling rapid and reliable species identification and classification across diverse organisms.
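To make the idea of comparing short DNA sequences concrete, the following minimal Python sketch computes an uncorrected pairwise distance (p-distance) between aligned fragments and assigns a query to the closest reference. It is only an illustration under stated assumptions: the function name `p_distance`, the toy sequences and the species labels are invented here, and nearest-distance assignment is one simple approach, not the method of any study cited above.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base
    are skipped (a simplifying assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments, invented for illustration (not real barcode data).
query = "ACGTTGCA-A"
references = {
    "species_X": "ACGTTGCATA",
    "species_Y": "ACGATGGATA",
}

# Assign the query to the reference with the smallest p-distance.
best = min(references, key=lambda name: p_distance(query, references[name]))
print(best)
```

In this toy example the query matches species_X at every comparable site, so it is assigned to that label. Real analyses would rely on curated alignments and reference libraries rather than two hand-written sequences.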
Amomum Roxb. is the second-largest genus in the Zingiberaceae family after
Alpinia and comprises approximately 111 [
27] to 150 [
28,
29] species distributed in tropical Asia and Australia, particularly in South and Southeast Asia, in countries such as India, Malaysia and Indonesia [
29]. In China,
Amomum comprises 39 species (29 endemic, one introduced) [
29], distributed mainly in Fujian, Guangdong, Guangxi, Guizhou, Yunnan and Tibet [
28]. Among them, six species have been listed in the Chinese Pharmacopoeia [
30]. These species are (1)
A. compactum Solander ex Maton (synonym:
Wurfbainia compacta (Sol. ex Maton) Škorničk. & A.D.Poulsen [
31]), (2)
A. kravanh Pierre ex Gagnep. (synonyms:
A. krervanh Pierre ex Gagnep. [
32],
W. vera (Blackw.) Škorničk. & A.D.Poulsen and
A. verum Blackw. [
32,
33]), (3)
A. longiligulare T.L.Wu (synonym:
W. longiligularis (T.L.Wu) Škorničk. & A.D.Poulsen [
34]), (4)
A. tsao-ko Crevost et Lemarié (synonym:
Lanxangia tsao-ko (Crevost & Lemarié) M.F.Newman & Škorničk. [
35]), (5)
A. villosum Lour. (synonym:
W. villosa (Lour.) Škorničk. & A.D.Poulsen [
36]) and (6)
A. villosum var.
xanthioides (Wall. ex Bak.) T.L.Wu & S.J.Chen (synonym:
W. villosa var. xanthioides (Wall. ex Baker) Škorničk. & A.D.Poulsen [
37]). They exhibit a diverse range of characteristics and applications. For instance,
A. compactum is a widely used culinary spice, and its fruits, leaves and seeds show a wide range of pharmacological activities in traditional medicine, including antifungal, antibacterial, antioxidant, gastroprotective, anti-inflammatory, immunomodulatory, anticancer and antiasthmatic effects, as well as activity against acute renal failure [
38]. Fruits of
A. kravanh have shown antibacterial activity [
39]. The active ingredients in
A.
longiligulare and
A. villosum var.
xanthioides have antibacterial activity [
40,
41]. In addition,
A. villosum var.
xanthioides has shown powerful antioxidant properties in the treatment of non-alcoholic fatty liver disease (NAFLD) and non-alcoholic steatohepatitis (NASH) [
42].
A. tsao-ko has been found to contain antifungal active substances [
43] and antioxidant ingredients [
44], indicating its potential medicinal properties; recent research also suggests that it has the ability to relieve constipation and could be a promising candidate for developing laxatives in the future [
45]. The total flavonoids extracted from
A. villosum have shown promising potential for developing new drugs to treat gastric cancer [
46]. Chemical components found in the seeds of
A. villosum can enhance cellular antioxidant activity [
47]. Additionally, Chen et al. (2018) have confirmed the potential beneficial effects of
A. villosum in the treatment of inflammatory bowel disease [
48]. Moreover, Li et al. (2016) have demonstrated that the fresh stems and leaves of
A. villosum can be used as high-quality feed for cattle, sheep, and other grass-eating livestock [
49]. However, the morphological similarity of these species makes them easy to confuse with one another, and they are also prone to substitution by other species of the same genus. Therefore, molecular identification through DNA barcoding is crucial for accurately identifying these six
Amomum species.
The ITS sequence is approximately 500-700 bp long. It exhibits a high degree of conservation, making it applicable across a wide range of biological species, particularly plants and fungi. Sequencing and analysis of ITS are rapid and cost-effective, especially compared with traditional morphological classification methods. Additionally, a large number of ITS sequences are available in public databases, providing researchers with reference resources that expedite species identification and classification. The GenBank database at the National Center for Biotechnology Information (NCBI) hosts an extensive collection of ITS sequences for
Amomum and its synonyms. As of April 11, 2024, it included 572 sequences representing 159 species. This dataset is a valuable resource for our DNA barcoding research. For the identification of medicinal plants and their discrimination from counterfeits, Selvaraj et al. (2012) found that ITS and specifically ITS1 (internal transcribed spacer 1) are effective DNA barcodes for
Boerhavia diffusa Linnaeus [
50]. The ITS2 (internal transcribed spacer 2) region has been utilized for the identification of medicinal plants and their closely related species [
51], such as within the Polygonaceae A. L. Jussieu family [
52] and the
Dendrobium Sw. genus [
53]. The ITS2 region has been shown to be the most promising universal DNA barcode for the Zingiberaceae Martinov family [
54]. Complete plastomes, used as super-barcodes, as well as the
matK and
rbcL genes, can effectively distinguish
A. compactum,
A. longiligulare, and
A. villosum [
55,
56]. Additionally, the
matK gene and the
psbA-
trnH intergenic spacer exhibit high identification efficiency for
A. tsao-ko and other
Amomum species [
57]. Among these candidates, the barcodes reported as more effective for molecular identification of
Amomum are ITS [
57,
58,
59], ITS1 [
60,
61], and ITS2 [
61,
62,
63]. These findings demonstrate the potential of DNA barcoding for species identification and classification within
Amomum. By using DNA barcoding, researchers can accurately identify and classify different
Amomum species, which helps us understand their diversity and evolutionary relationships, and provides effective tools and methods for the protection, sustainable utilization and medicinal value research of
Amomum.
In this study, we combined newly sequenced data with sequences obtained from the NCBI database, covering (1) ITS, (2) ITS1, (3) ITS2, (4) complete plastomes, (5) matK, (6) rbcL, and (7) psbA-trnH, to calibrate and precisely identify six medicinal plants of the genus Amomum. DNA barcoding allowed us to identify the medicinal Amomum species at the molecular level, thereby reducing the errors associated with traditional morphological methods. Our findings may support the sustainable utilization and conservation of Amomum resources, facilitate industry development and quality control, and ultimately provide scientific and societal benefits.
5. Conclusions
We examined plastome structural variation and evaluated the efficacy of standard and super DNA barcodes for resolving species boundaries, based on intraspecific and interspecific variation, in six medicinal Amomum plants. The six species were molecularly identified using ITS, ITS1, ITS2, complete plastomes, matK, rbcL, and psbA-trnH sequences. Of these seven datasets, ITS, ITS1 and complete plastomes successfully identified A. compactum, A. kravanh, and A. tsao-ko, whereas ITS2, matK, and psbA-trnH identified only A. tsao-ko, and rbcL failed to identify any species. Thus, ITS, ITS1 and complete plastomes had the highest identification rates, followed by ITS2, matK, and psbA-trnH, with rbcL having the lowest. Taking cost and other factors into account, we strongly recommend ITS1 for the molecular identification of the six medicinal Amomum plants. This study developed reliable molecular identification methods for the genus Amomum, which are crucial for protecting wild plant resources, using medicinal plants rationally, and preventing resource misuse. It also provides essential molecular tools for species identification and classification, enhancing our understanding of medicinal Amomum plants.
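The comparison of variation within and between species is often summarized as a barcoding gap: for each species, the minimum interspecific distance minus the maximum intraspecific distance. The Python sketch below shows one way this could be computed. It is an assumption-laden illustration rather than the pipeline used in this study: the p-distance metric, the function names `p_distance` and `barcoding_gaps`, and the toy sequences are all hypothetical choices made for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected distance between two aligned, upper-case sequences.

    Sites with a gap or ambiguous base in either sequence are ignored.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b)
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def barcoding_gaps(records):
    """Return {species: min interspecific - max intraspecific distance}.

    `records` is a list of (species_label, aligned_sequence) tuples.
    A species with a single individual is given an intraspecific
    distance of 0.0 (an assumption of this sketch).
    """
    max_intra = {}
    min_inter = {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, 1.0), d)
    return {sp: min_inter[sp] - max_intra.get(sp, 0.0) for sp in min_inter}


# Toy aligned sequences, invented for illustration only.
records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGAACGTTC"),
    ("sp_B", "ACGAACGTTC"),
]

gaps = barcoding_gaps(records)
for species, gap in sorted(gaps.items()):
    status = "separated" if gap > 0 else "overlapping"
    print(species, round(gap, 3), status)
```

A positive value means that every individual of a species is closer to its conspecifics than to any other species in the dataset; a zero or negative value indicates that the marker cannot separate that species by distance alone.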
Supplementary Material: The following supporting information can be downloaded at the website of this paper posted on Preprints.org.
Tables:
Supplementary Table 1. Summary of significant characteristics of the plastomes of six medicinal Amomum plants, including genome size, G-C content, and gene number.
Supplementary Table 2. Net differences between minimum interspecific and maximum intraspecific distances for six medicinal plants in the genus Amomum across seven datasets, derived from barcoding gap analysis.
Supplementary Table 3. The number of putative species recognized by automatic barcode gap discovery (ABGD) analyses of seven datasets using three distance metrics.
Supplementary Table 4. All samples of Amomum and its synonyms used in this study (those marked with “*” are individuals sequenced by ourselves; the others were downloaded from NCBI).
Figures:
Supplementary Figure 1. Phylogenetic tree reconstructed with the maximum likelihood (ML) method from the ITS dataset of all individuals of Amomum and its synonyms. The numbers at nodes indicate ML bootstrap values (BS).
Supplementary Figure 2. Phylogenetic tree reconstructed with the Bayesian inference (BI) method from the ITS dataset of all individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 3. Phylogenetic tree reconstructed with the ML method from the matK dataset of all individuals of Amomum and its synonyms. The numbers at nodes indicate ML bootstrap values (BS).
Supplementary Figure 4. Phylogenetic tree reconstructed with the BI method from the matK dataset of all individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 5. Phylogenetic tree reconstructed with the ML method from the rbcL dataset of all individuals of Amomum and its synonyms. The numbers at nodes indicate ML bootstrap values (BS).
Supplementary Figure 6. Phylogenetic tree reconstructed with the BI method from the rbcL dataset of all individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 7. Phylogenetic tree reconstructed with the BI method from the ITS dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 8. Phylogenetic tree reconstructed with the ML method from the ITS2 dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate ML bootstrap values (BS).
Supplementary Figure 9. Phylogenetic tree reconstructed with the BI method from the ITS2 dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 10. Phylogenetic tree reconstructed with the BI method from the complete plastome dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 11. Phylogenetic tree reconstructed with the ML method from the matK dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate ML bootstrap values (BS).
Supplementary Figure 12. Phylogenetic tree reconstructed with the BI method from the matK dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 13. Phylogenetic tree reconstructed with the ML method from the psbA-trnH dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate ML bootstrap values (BS).
Supplementary Figure 14. Phylogenetic tree reconstructed with the BI method from the psbA-trnH dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 15. Phylogenetic tree reconstructed with the ML method from the rbcL dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate ML bootstrap values (BS).
Supplementary Figure 16. Phylogenetic tree reconstructed with the BI method from the rbcL dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).
Supplementary Figure 17. Phylogenetic tree reconstructed with the BI method from the ITS1 dataset of selected individuals of Amomum and its synonyms. The numbers at nodes indicate BI posterior probabilities (PP).