1. Introduction
Cassava (Manihot esculenta Crantz, Family: Euphorbiaceae) is an important root crop, widely cultivated in Africa [1] for its tuberous roots rich in starch [2,3] and its leaves rich in protein, minerals, vitamins and carotenoids [4]. It is an important food security crop, particularly for smallholder farmers in Africa [5]. In 2021, cassava production in Africa was estimated at 203.57 million tons, representing 64.67% of world production [6]. The crop is gaining popularity because it yields better than most crops in drought-prone ecologies and poor soils [7] and because its planting and harvesting times are flexible [8].
Cassava is an allogamous species [9]. In traditional farming systems, the coexistence of different cassava accessions in the same or neighboring fields is common. Through cross-pollination, this coexistence increases genetic diversity in fields [10]. In addition, farmers frequently exchange planting materials, which leads to a high diversity of accessions in the fields [11]. As a result, depending on the collection locality, different accessions may share the same name, while a single accession may be given different names. This leads to the presence of duplicates among accessions collected in different localities [12]. The ability to identify and remove duplicates from collected germplasm is very important for breeding activities. In addition, the success of a breeding program depends on a good understanding of the genetic variability within the existing population. Therefore, it is important to carry out studies to identify duplicated accessions and assess the genetic diversity among accessions in order to provide breeding programs with unique genotypes [1,13]. Cassava genetic diversity has been assessed using morphological descriptors [14,15] and molecular markers [1,16,17,18]. However, morphological descriptors are known to be affected by the interaction between genotype and environment. Molecular markers, on the other hand, are stable, easily detectable and not influenced by the environment [19,20]. Various molecular tools can be used to assess the genetic diversity of crops, including Random Amplified Polymorphic DNA (RAPD) [21], Restriction Fragment Length Polymorphism (RFLP) [21], Amplified Fragment Length Polymorphism (AFLP) [22], Simple Sequence Repeat (SSR) [1,19,23], Single Nucleotide Polymorphism (SNP) [13,18,24,25,26], and Diversity Arrays Technology (DArT) [27,28]. Locus-specific markers such as SSRs have been preferentially applied in genetic diversity and population structure assessment in many crops [1,29,30,31]. With the advent of whole-genome sequencing and the ability to detect SNPs, SNPs have also gained importance in genetic diversity and population structure studies [9,13,18,24].
Genomic analysis and identification of potential duplicate accessions in cassava germplasm based on SNPs have been carried out in Burkina Faso. A high rate of potential duplicates (52.41%) and a complex genetic structure of accessions were observed [32]. The polymorphisms of SSRs and SNPs are generated via different mechanisms, and the two types of marker can therefore provide different views of the diversity of a given population [33]. In this study, a total of 132 accessions were selected from the Burkina Faso cassava germplasm and genotyped using SSR markers in order to estimate the genetic diversity and the number of unique multilocus genotypes.
4. Discussion
Understanding the genetic diversity of a species underpins the success of any breeding program and the development of strategies for germplasm management, conservation, and improvement [28]. Assessing the genetic variability of a given population in order to provide breeding programs with interesting parental lines is a very important pre-breeding operation and must take into account both the morphological and the molecular variability of the existing population. Genetic diversity studies using morphological traits alone are limited by the interaction between environment and genotype effects [50]. These limitations may prevent the accurate detection of duplicates. According to Collard et al. (2005), molecular markers can detect genetic differences among closely related genotypes. In addition, assessing the agro-morphological diversity of cassava requires a great deal of space, depending on the number of accessions, and takes several months (9 to 12 months) [15,28]. It is therefore advisable to assess molecular diversity within the germplasm and to identify the unique multilocus genotypes first, before assessing agro-morphological diversity.
Molecular markers should be ubiquitous, reasonably polymorphic, reproducible, and easily detectable [39], as SNPs and SSRs are. In practice, no molecular marker method satisfies all expectations without presenting some challenge in its application. The choice of marker technique depends strongly on factors such as the study objective, the level of genetic variability in the population, the sample size, the accessibility of primers, the availability of technical know-how and appropriate facilities, time, and financial considerations [52,53]. In addition, the number of alleles depends on the type of marker. For example, SNP markers are typically bi-allelic, whereas SSR markers can have many alleles per locus [1]. Whatever the type of marker used, it is important to determine the minimum number of markers that can efficiently discriminate the maximum number of accessions [40].
Genomic analysis of cassava accessions and identification of potential duplicates based on SNPs in Burkina Faso revealed a high rate (52.41%) of potential duplicates [32]. This high rate prompted us to genotype the accessions using SSR markers in order to estimate the genetic diversity and the number of unique multilocus genotypes (MLGs) in the Burkina Faso cassava germplasm. The 132 accessions were randomly selected from germplasm originating from the major cassava-growing regions and genotyped using 32 SSR markers. The genotype accumulation curve showed that the 32 SSR markers were sufficient to discriminate the 130 accessions. Moreover, it revealed that unique MLGs accounted for 83.8% of the population. This rate was higher than the rate of MLGs (47.6%) found in the previous study [32], despite the smaller number of accessions used in this study. These results indicate that the 32 SSR markers have a greater capacity to estimate the number of MLGs than the 34 SNP markers used in the previous study.
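To illustrate how the number of MLGs, the duplicate rate, and a genotype accumulation curve relate to one another, the following is a minimal Python sketch. It is not the pipeline used in this study: the function names, the toy allele sizes, and the definition of the duplicate rate as the share of accessions beyond the first copy of each profile are all assumptions made for illustration, and missing data are ignored.

```python
import random
from statistics import mean


def count_mlgs(genotypes, loci=None):
    """Count distinct multilocus genotypes (MLGs).

    genotypes: dict mapping accession name -> tuple of per-locus genotypes,
               each genotype being a sorted (allele, allele) pair.
    loci:      optional list of locus indices to restrict the comparison to.
    """
    if loci is None:
        profiles = set(genotypes.values())
    else:
        loci = list(loci)
        profiles = {tuple(g[i] for i in loci) for g in genotypes.values()}
    return len(profiles)


def duplicate_rate(genotypes):
    """Share of accessions that repeat an already-seen profile (assumed definition)."""
    n = len(genotypes)
    return (n - count_mlgs(genotypes)) / n


def accumulation_curve(genotypes, n_loci, n_resamples=100, seed=0):
    """Mean number of MLGs detected with k randomly chosen loci, for k = 1..n_loci."""
    rng = random.Random(seed)
    curve = []
    for k in range(1, n_loci + 1):
        counts = [
            count_mlgs(genotypes, rng.sample(range(n_loci), k))
            for _ in range(n_resamples)
        ]
        curve.append(mean(counts))
    return curve


# Hypothetical toy data: 4 accessions scored at 3 SSR loci (allele sizes in bp).
toy = {
    "acc1": ((150, 152), (200, 200), (98, 102)),
    "acc2": ((150, 152), (200, 200), (98, 102)),  # same profile as acc1
    "acc3": ((150, 150), (200, 204), (98, 102)),
    "acc4": ((150, 152), (200, 204), (98, 98)),
}

print(count_mlgs(toy))      # 3 distinct profiles
print(duplicate_rate(toy))  # 0.25
print(accumulation_curve(toy, n_loci=3))
```

When the curve flattens before all loci are included, adding further markers no longer separates additional accessions, which is the sense in which a marker set can be called sufficient.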
The analysis of genetic diversity parameters for the 130 accessions showed that the 32 SSR markers were polymorphic, with a mean PIC value of 0.40. This value was higher than that reported by Moyib et al. [54] but lower than those reported by other authors [1,19,55,56]. These differences could be explained by the specificity of each cassava germplasm studied and of the SSR markers used. Furthermore, the mean PIC value observed in this study was higher than those observed previously in Burkina Faso using SNP markers. This difference could be explained by the bi-allelic nature of SNP markers, in contrast to SSR markers, which are multi-allelic [18]. Indeed, the number of alleles per locus in this study ranged from 2 to 6. The average observed heterozygosity (Ho) in this study was higher than the expected heterozygosity (He), suggesting a heterozygote excess within the 130 cassava accessions. This heterozygote excess was confirmed by negative values of FIS and FIT. In addition, an excess of heterozygosity in cassava populations has been reported in several studies [1,19,55,56].
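The link between PIC, Ho, He, and a negative fixation index can be sketched for a single locus as follows. This is an illustrative Python example under stated assumptions, not the software used in this study: it uses the classical PIC formula of Botstein et al., the uncorrected expected heterozygosity (one minus the sum of squared allele frequencies), and a simple single-population index F = 1 - Ho/He in place of the hierarchical FIS and FIT estimates. The function name and the toy genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Per-locus diversity statistics for diploid genotypes without missing data.

    genotypes: list of (allele, allele) pairs, one pair per accession.
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = [c / (2 * n) for c in allele_counts.values()]

    # Observed heterozygosity: share of accessions carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity under Hardy-Weinberg (no small-sample correction).
    he = 1.0 - sum(p * p for p in freqs)
    # PIC = He minus the sum over allele pairs of 2 * p_i^2 * p_j^2.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    # Fixation index: negative when heterozygotes exceed expectation.
    f = 1.0 - ho / he if he > 0 else float("nan")

    return {"alleles": len(allele_counts), "Ho": ho, "He": he, "PIC": pic, "F": f}


# Hypothetical locus: three heterozygous accessions and one homozygous accession.
stats = locus_stats([(150, 152), (150, 152), (150, 152), (150, 150)])
print(stats)  # Ho exceeds He here, so F is negative (heterozygote excess)
```

For a bi-allelic locus, PIC computed this way cannot exceed 0.375, whereas a locus with more alleles can reach higher values, which is consistent with the higher mean PIC expected from multi-allelic SSR markers.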
Molecular profiling of accessions revealed a lower rate of duplicates (16.2%) in this study than in the previous study, in which 52.41% of duplicates were found [32]. This could be explained by the small number of SNP markers used in the previous study (34 SNP markers). Indeed, given the multi-allelic nature of SSR markers and the bi-allelic nature of SNP markers, more SNP markers than SSR markers may be needed to achieve the same degree of resolution [39,40]. PCoA was not able to differentiate cassava accessions according to their origin. In addition, DAPC performed using the regions as predefined groups did not reveal a clear differentiation of accessions according to origin. This absence of differentiation was confirmed by the low value of the genetic differentiation index (FST), which was 0.025. Furthermore, the AMOVA results indicated that 93.69% of the molecular variation was found within individuals, with only 6.31% between regions. This could be because some accessions are grown in several regions of Burkina Faso [32]. The analysis also revealed a weak differentiation of the accessions according to breeding patterns, with a low FST value (0.008). This absence of differentiation is probably because most of the improved varieties are grown in cassava fields [32].
The dendrograms obtained by hierarchical clustering showed that the 130 cassava accessions can be grouped into two large clusters. As mentioned in the previous study [32], this truncation may not reflect the real structure of the population, given that it was done at the top of the dendrogram. The number of clusters obtained using the Bayesian approach (5 clusters) in this study was higher than that obtained in the previous study (2 clusters). This could be because the number of duplicate accessions was low in this study. Several studies have argued that a low rate of duplicate accessions could improve the accuracy of the Bayesian approach [32,57]. The DAPC performed on the 130 cassava accessions divided them into 13 clusters with an individual assignment probability of 100%. The difference between the results of the Bayesian approach and the DAPC could be due to the multivariate approach used by the DAPC and to the fact that the Bayesian approach is based on the Hardy–Weinberg equilibrium (HWE) model. However, for vegetatively propagated species such as cassava, this equilibrium is often not respected [9,57,58]. It was found that nearly 70% of the molecular variance was between the clusters formed by DAPC, compared to only 30% within the accessions. In contrast, the molecular variance between clusters formed by the Bayesian approach represented 47%, compared to 53% within the accessions. As a result, DAPC could be more suitable, as it can assess genetic structure without any assumptions about the genetic model of the population [32,42].
The DAPC performed in this study suggested fewer clusters (13) than that of Soro et al. (17 clusters) [32]. This could be due to the smaller number of accessions used in this study (130) than in the study by Soro et al. (166). The analyses carried out on 104 accessions genotyped using both SSR and SNP markers revealed the same number of clusters (10), with a high individual assignment probability (100%) of accessions into clusters for the two types of markers (Figure S1 and Figure S2). For both marker systems (SNP and SSR), the same number of clusters has been observed by several authors using different genetic structure assessment methods [59,60]. These results could be very useful for laboratories with limited resources. SSR markers are available for several crops and the SSR genotyping technique can be implemented in any molecular biology laboratory.
Author Contributions
Conceptualization, M.S., F.T., and K.S.; methodology, M.S., J.S.P.; formal analysis, M.S., D.H.O., and S.M.F.W-P.; data curation, M.S.; writing—original draft preparation, M.S.; writing—review and editing, F.T., K.S., J.S.P., J.B.N., and D.K.; supervision, D.K., J.S.P and F.T; project administration, F.T., and J.S.P.; funding acquisition, F.T., and J.S.P. All authors have read and agreed to the published version of the manuscript.