1. Introduction
Bananas and plantains (Musaceae) are grown in all tropical and subtropical regions worldwide. They are the fourth largest food crop on the global market, after rice, wheat, and maize [
1].
Most commercial banana cultivars originated from crosses between the wild species
Musa acuminata Colla (2n = 2x = 22; genome A) and
Musa balbisiana Colla (2n = 2x = 22; genome B), which produced a series of diploid, triploid, and tetraploid bananas. The genomic groups resulting from these crosses are classified as AA, AB, AAA, AAB, ABB, AABB, AAAB, and ABBB [
2]. The genome sequence of
M. acuminata ssp.
malaccensis, derived from a double haploid Pahang accession, represents the A genome (n = 11) [
3,
4], whereas that of
M. balbisiana, derived from a Pisang Klutuk Wulung accession, represents the B genome (x = 11) [
5,
6].
The A genome is mostly related to improved production, yield, and fruit quality attributes, whereas the B genome lends robustness and tolerance/resistance to abiotic and biotic stresses [
3,
5,
6]. The “B” genome is associated with the
banana streak virus (
BSV) [
7], which influences the exchange of accessions between germplasm banks, field management, and in vitro cultivation. The virus has two forms of endogenous sequences (eBSV) in the “B” genome [
8,
9]: i) incomplete sequences that are considered evolutionary relics from previous infections and do not cause the disease, and ii) complete sequences that are initially dormant and activated to promote pathogenesis when the plant is challenged by biotic/abiotic stresses [
10,
11].
The genomic composition of banana is unpredictable, even in controlled crosses, owing to "unbalanced meiosis" and homologous recombination between “A” and “B” genomes, leading to a different number of sets or segments of each parent genome [
12,
13,
14].
Ploidy is determined in banana using several methods, including morphological markers [
15]. However, morphological markers are sensitive to environmental factors and are imprecise and impractical to measure at a large scale [
13,
16]. The use of molecular markers to distinguish the doses of “A” and “B” genomes in
Musa spp. has been evaluated [
13,
15,
17]; however, despite their advantages over morphological markers, these markers are vulnerable to co-amplification with fungal DNA, if present, and to the presence of multiple copies, which leads to misidentification and, ultimately, low accuracy.
Breeding programs seek effective and long-lasting techniques to improve crop characteristics, but are limited by the complex inheritance of most agronomic traits and strong genotype–environment interaction [
18]. Recently, the CRISPR/Cas9 system has been widely used to induce specific genome mutations in several plant species, which has greatly contributed to the study of gene function in crop genetic improvement programs. This technique facilitates gene editing by cutting and replacing or adding sequences to the DNA of a given genotype [
19]. To validate the use of CRISPR/Cas9 for improving tolerance to biotic and abiotic stresses in banana, previous studies have proposed knocking out the phytoene desaturase (PDS) gene as an initial proof of concept [
20,
21,
22].
The PDS gene has been widely used as a molecular marker for genome editing in several plant species, including bananas [
20,
21]. PDS is highly conserved, with similar catalytic properties across plant species, and is a key enzyme in the carotenoid biosynthesis pathway, catalyzing the desaturation of phytoene (a colorless compound) into ζ-carotene, which is subsequently converted into lycopene, a colored compound [
23]. PDS knockout affects photosynthesis, gibberellin production, and carotenoid biosynthesis, which leads to dwarfism and albinism in plants [
24,
25,
26], suggesting that PDS can be a selective marker for the development of genetic engineering products.
The objective of this study was to develop a marker from the PDS gene capable of differentiating the A genome (M. acuminata) from the B genome (M. balbisiana) in banana. To validate its potential, the gene marker was tested on 150 banana accessions with different ploidy types collected from the Embrapa Mandioca e Fruticultura Germplasm Bank. This is the first report of a PDS gene-derived molecular marker that can identify “A” and “B” genomes in banana with 99.33% and 100% accuracy, respectively. Our study provides a foundation for the preliminary characterization of the genomic composition of banana accessions to predict agronomic, sensory, and resistance/tolerance characteristics that are desirable for genetic improvement programs.
3. Results and Discussion
The complete PDS gene sequences of
M. acuminata (AA) and
M. balbisiana (BB) were downloaded from the Banana Genome Hub (
https://banana-genome-hub.southgreen.fr/) on the SouthGreen platform. The PDS gene (Ma08_g16510) of
M. acuminata has 27 944 bp and 14 exons, and the PDS gene (Mba08_g16040.1) of
M. balbisiana has 21 262 bp and 11 exons. The aligned sequences shared 96.40% nucleotide identity from the start codon to the stop codon.
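A percentage such as the one above can be computed from a pairwise alignment as sketched below. This is only an illustration: the source does not state how gap columns were counted, so the sketch assumes that every aligned column enters the denominator and that gaps never count as matches. The function name and the toy sequences are hypothetical and are not real PDS sequence data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned columns holding the same base in both sequences.

    Assumption: gap columns ('-') count in the denominator but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment (12 columns, one mismatch, one gap).
toy_a = "ATGGCTT-ACGA"
toy_b = "ATGGCATTACGA"
print(round(percent_identity(toy_a, toy_b), 2))
```

Other conventions (for example, excluding gap columns from the denominator) would give slightly different values for the same alignment.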
The PDS gene has often been used as a proof-of-concept marker in CRISPR/Cas9 gene editing experiments in many plant species, such as maize [
32],
Arabidopsis [
33], tomato [
34], rice [
35], and banana [
26].
To construct the PDSMa- and PDSMb-specific markers, the coding regions of the downloaded PDS gene sequences were aligned using Clustal Omega (
https://www.ebi.ac.uk/Tools/msa/clustalo/) to identify discriminatory/polymorphic regions between the “A” and “B” genomes (
Figure 1).
The specific primers for PDSMa and PDSMb were evaluated by PCR. Two primers were used as controls—
β-tubulin, as an endogenous gene, and the primer developed by Ntui et al. [
26] to amplify the PDS gene in both the “A” and “B” genomes (PDS_AB) (
Figure 2A). The amplification of β-tubulin, PDS_AB [
26], and PDSMa in 12 banana samples with representative genomes of different ploidy types (Germplasm Bank of Embrapa Mandioca e Fruticultura) is shown in
Figure 2A.
The cultivars Balbisiana Franca (BB), Butuhan (BB),
Musa balbisiana (BB), BB Franca (BB), and Teparod (ABBB) showed band amplification only for the PDS_AB and
β-tubulin primers, confirming the discriminatory power of the PDSMa primer (
Figure 2B).
The PDSMa primer has 476 bp and was constructed without intron regions. The amplification of this primer in all “A” genome
Musa accessions produced a 2166 bp fragment (
Figure 2B) based on the regression model y = –0.003x + 9.5999 (R² = 0.90). This band size reflects our use of total genomic DNA, which contains introns. The PDS_AB and
β-tubulin primers produced fragments of 994 and 110 bp, corroborating the PDS gene sizes reported by Ntui et al. [
26] and Podevin et al. [
30], respectively.
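Estimating a band size from a linear standard curve of this kind can be sketched as follows. The source does not define the variables of the regression model, so this sketch assumes that x is the fragment size in bp and y is the measured gel reading for a band (for example, migration distance). The function names and the ladder values are hypothetical; the ladder readings are generated from the reported PDSMa model purely for illustration.

```python
def fit_standard_curve(sizes_bp, readings):
    """Ordinary least-squares fit of reading = slope * size + intercept."""
    n = len(sizes_bp)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two matched ladder points")
    mean_x = sum(sizes_bp) / n
    mean_y = sum(readings) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_bp)
    if sxx == 0:
        raise ValueError("ladder sizes must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_bp, readings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_fragment_size(reading, slope, intercept):
    """Invert the standard curve to recover the fragment size in bp."""
    if slope == 0:
        raise ValueError("slope must be non-zero to invert the model")
    return (reading - intercept) / slope


# Hypothetical ladder whose readings follow the reported PDSMa model exactly.
ladder_bp = [100, 500, 1000, 2000, 3000]
ladder_readings = [-0.003 * bp + 9.5999 for bp in ladder_bp]

slope, intercept = fit_standard_curve(ladder_bp, ladder_readings)
band_reading = -0.003 * 2166 + 9.5999  # reading implied by a 2166 bp band
print(round(estimate_fragment_size(band_reading, slope, intercept)))
```

With real gel data the ladder points would scatter around the fitted line, which is what the reported R² values summarize.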
The amplification of the PDSMb primer in six banana samples with different ploidy types is shown in
Figure 3. A fragment of approximately 332 bp was observed in all samples with A and B ploidy (
Figure 3), which was based on the regression model y = –0.0123x + 13.989 (R² = 0.96). In addition to the ~322 bp band, the cultivars Zebrina (AA), Gros Michel (AAA), and Bucaneiro (AAAA) presented a second specific band of ~225 bp (based on the same regression model), indicating that this band pattern occurred only in accessions composed entirely of the “A” genome (
Figure 3).
Table 2 shows the banana accessions from the Embrapa Mandioca e Fruticultura Germplasm Bank and their respective PCR amplification results with the four primers. The 150 genotypes evaluated were represented by different ploidy types: AA, AAA, AAAA, BB, AB, AAB, ABB, AAAB, AABB, and ABBB.
Table 2 shows cultivars with bands amplified using the primers
β-tubulin, PDS_AB, PDSMa, and PDSMb. All genotypes were positive for the primers
β-tubulin and PDS_AB. For the PDSMa primer, amplification did not occur in genotypes with >75% of genome “B”, and all 100% “A” ploidy genotypes had amplified bands of ~225 bp in PDSMb.
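The band patterns described above can be read as a simple decision rule, sketched below. This is an illustrative summary rather than the procedure used in the study: the function name, argument names, and returned labels are hypothetical, and the rule assumes that the β-tubulin and PDS_AB controls amplified correctly.

```python
def interpret_bands(pdsma_band: bool, pdsmb_225_band: bool,
                    controls_ok: bool = True) -> str:
    """Map PCR band calls to a provisional genome reading (illustrative only).

    pdsma_band:     the PDSMa fragment was amplified.
    pdsmb_225_band: the second ~225 bp PDSMb band was amplified.
    controls_ok:    the beta-tubulin and PDS_AB controls were both positive.
    """
    if not controls_ok:
        return "invalid: control primers failed"
    if pdsmb_225_band and not pdsma_band:
        # Pattern not described in the source; treat as a technical problem.
        return "inconsistent: repeat PCR"
    if pdsmb_225_band:
        return "A genome only"
    if pdsma_band:
        return "A and B genomes"
    return "A genome not detected"


# Patterns reported in the text: pure "A" accessions show both bands,
# whereas BB accessions show neither.
print(interpret_bands(True, True))
print(interpret_bands(True, False))
print(interpret_bands(False, False))
```

A result of "A genome not detected" should be read with care, since one ABBB accession in the source data also lacked the PDSMa band.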
Banana genetic improvement is based on crossing wild or improved diploids with commercial cultivars to generate hybrids resistant/tolerant to biotic and abiotic stresses and with agronomic characteristics consistent with market demands [
36,
37]. Nwakanma et al. [
15] suggested that early determination of the banana genome composition can aid breeders in predicting the occurrence of useful agronomic characteristics and developing new varieties.
The development of molecular markers capable of discriminating high doses of the “B” genome in bananas is essential for determining gene composition and inferring important characteristics in hybrids [
22]. The PDSMb primer developed in this study proved useful for characterizing the genomic composition of cultivars developed in the Embrapa breeding program. This marker effectively identified the “B” genome in the different accessions: even when a sample carries only 25% of the “B” genome, the second ~225 bp band is not amplified (
Figure 3,
Table 2).
MedCalc software (
https://www.medcalc.org/calc/diagnostic_test.php) was used in the analysis of the PDSMa and PDSMb markers. This software is used in the health sector for disease diagnosis and can be adapted for use in plants. The program takes the numbers of true positives, false negatives, false positives, and true negatives as input and returns the sensitivity, specificity, positive and negative likelihood ratios, and accuracy of the polymorphic fragment.
Of the 150 accessions subjected to PCR, 145 showed bands at 2166 bp for the PDSMa primer, indicating the presence of the “A” genome, and 92 showed bands at ~225 bp for the PDSMb primer. These samples were classified as true positive in the MedCalc analysis.
Only one sample, the Teparod genotype (ABBB), was identified as a false negative for the PDSMa primer, because the “A” genome in its composition was not identified by band amplification in this region. This result corroborates the occurrence of homologous recombination between “A” and “B” genome cultivars, suggesting that, in this specific case, the Teparod genotype may not be carrying the full complement of the “A” genome [
13,
38,
39]. There were no false negatives for PDSMb, as band amplification (~225 bp) occurred in all cultivars with 100% “A” ploidy.
None of the samples lacking the “A” genome showed the 2166 bp fragment with PDSMa, and none of the samples containing the “B” genome showed the second ~225 bp band with PDSMb; thus, no false positives were recorded. Four samples containing the “B” genome showed no amplification with PDSMa, and 58 genotypes containing the “B” genome showed no amplification of the second band with PDSMb, making them true negatives. We calculated the statistical parameters of the PDSMa and PDSMb markers, which showed 99.32% and 100% sensitivity, 100% specificity, 100% positive predictive value, 80% and 100% negative predictive value, and 99.33% and 100% accuracy, respectively, indicating that the PDSMa marker is highly effective in discriminating “B” genome doses >75% in banana genotypes and that the PDSMb marker can identify accessions with 100% of the “A” genome in their ploidy (
Table 3).
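The reported values can be reproduced from the counts given in the text using the standard definitions of these diagnostic statistics, as in the sketch below. It is not the MedCalc implementation, and the function name is hypothetical. The counts are those stated above: 145 true positives, 1 false negative, 0 false positives, and 4 true negatives for PDSMa; 92 true positives, 0 false negatives, 0 false positives, and 58 true negatives for PDSMb.

```python
import math


def diagnostic_stats(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard 2x2 diagnostic statistics; proportions are returned in percent."""

    def ratio(num: float, den: float) -> float:
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else math.nan

    sens = ratio(tp, tp + fn)
    spec = ratio(tn, tn + fp)
    return {
        "sensitivity": 100 * sens,
        "specificity": 100 * spec,
        "ppv": 100 * ratio(tp, tp + fp),
        "npv": 100 * ratio(tn, tn + fn),
        "accuracy": 100 * ratio(tp + tn, tp + fn + fp + tn),
        # Likelihood ratios are unbounded when the denominator is zero.
        "lr_positive": sens / (1 - spec) if spec < 1 else math.inf,
        "lr_negative": (1 - sens) / spec if spec > 0 else math.inf,
    }


pdsma = diagnostic_stats(tp=145, fn=1, fp=0, tn=4)
pdsmb = diagnostic_stats(tp=92, fn=0, fp=0, tn=58)

for name, stats in (("PDSMa", pdsma), ("PDSMb", pdsmb)):
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The 80% negative predictive value for PDSMa follows from its small negative class: four true negatives against the single false negative (Teparod).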
The use of molecular markers to determine the genomic composition of
Musa cultivars and other crops has many advantages over morphological markers [
40]. Several studies have used molecular methods to identify the genomes of
M. acuminata and
M. balbisiana. Nwakanma et al. [
15] and Jesus et al. [
13] identified molecular markers based on internal transcribed spacers (ITS), which discriminated “A” from “B” genomes in bananas, but not very accurately. Hollingsworth [
41] showed that markers based on ITS regions were vulnerable to co-amplification with fungal DNA, leading to misidentification and multiple, possibly divergent, ITS copies in a single specimen.
Mabonga and Pillay [
42] developed a 500 bp SCAR marker based on a RAPD marker to identify the “A” genome in bananas and plantains. Although the marker was useful for identifying the “A” genome, a 700 bp fragment hybridized with all the genotypes and impeded the differentiation of “A” and “B” genomes. Many primers have been obtained by converting RAPD markers into SCAR markers. However, this conversion generally leads to a decreased level of polymorphism [
43], particularly with different genetic backgrounds.
The identification of genotypes with B genome doses based on the absence of a band is also valuable for predicting BSV disease onset, which is mainly caused by three virus species:
Goldfinger (
eBSGFV),
Imové (
eBSIMV), and
Obino l’Ewai (
eBSOLV). Because of viral introgression in the B genome (
eBSV), caution must be exercised in the cultivation of “B” genome accessions since disease onset can be stimulated by the several
in vitro subcultures required by the crop, which is vegetatively propagated, and external plant stresses, such as low temperatures [
9,
44]. Thus, early identification of the “B” genome could be instrumental in enhancing crop management practices in the agricultural field.
The highly accurate PDS gene markers developed in this study to discriminate the “A” and “B” genomes in bananas represent a useful new tool for the genetic improvement of Musaceae crops, particularly because they derive from a highly conserved gene with few copies. These markers can potentially be used in the molecular characterization of germplasm collections and new accessions to expand the genetic diversity of the crop, which would be useful in discriminating between controlled and uncontrolled crosses and providing information for seedling exchange. These activities constitute the basis of genetic improvement programs for bananas.
Author Contributions
Conceptualization, F.d.S.N. and M.S.M.; methodology, F.d.S.N., M.S.M., A.d.J.R., J.M.d.S.S., C.F.F., T.A.d.O.M. and E.P.A.; software, F.d.S.N., M.S.M. and A.d.J.R.; validation, E.P.A. and C.F.F.; formal analysis, F.d.S.N., M.S.M. and E.P.A.; investigation, F.d.S.N., A.d.J.R., J.M.d.S.S., A.P.d.S.R., M.S.M., S.C.B. and C.C.H.d.S.; resources, E.P.A. and C.F.F.; data curation, F.d.S.N.; writing—original draft preparation, F.d.S.N., E.P.A. and C.F.F.; writing—review and editing, E.P.A., C.F.F. and L.E.C.D.; visualization, E.P.A. and C.F.F.; supervision, E.P.A., C.F.F., J.A.d.S.-S., T.A.d.O.M. and L.E.C.D.; project administration, E.P.A.; funding acquisition, E.P.A. All authors have read and agreed to the published version of the manuscript.