1. Introduction
The field of genetics and genomics has evolved greatly in the wake of ongoing technological advancements [
1,
2,
3]. Consequently, diverse methods have arisen to investigate genetic diversity. Some of these methods gain popularity and momentum only to be replaced by more effective and faster techniques, e.g. allozymes [
4], Restriction Fragment Length Polymorphisms (RFLPs [
5]), while some methods, such as microsatellites, endure the test of time [
6,
7]. The development and subsequent availability of high-throughput sequencing methods (“next-generation sequencing”, NGS) have once again changed the game for conservation genetics, providing highly informative and precise genetic information for diverse applications [
1,
3]. These methods, such as SNPs, genotyping-by-sequencing (GBS), restriction-site associated DNA sequencing (RADseq [
8]) and multiplexed ISSR genotyping-by-sequencing (MIG-seq [
9]), provide invaluable insights into population diversity, dynamics and viability, the identification of taxonomic and management units, the detection of local adaptation and the prediction of genetic effects, all of which can greatly inform conservation decisions [
1,
3,
10].
However, these high-throughput NGS methods are often prohibitively expensive in developing countries due to a lack of infrastructure, skills and resources, as well as unfavourable exchange rates for equipment and consumables [
2,
11,
12]. Consequently, these countries have to resort to cheaper genetic markers [
7]. These countries are coincidentally also often biodiverse, containing disproportionately high levels of rare taxa, and experience great human- or climate-related pressures, further increasing the extinction risk of endemic or rare species [
13].
Because of this financial and skills inequity, conservation research is not performed in areas where it is most needed [
11,
12]. Conservation spending (and budgets) correlates with the level of research performed and the reduction in the rate of biodiversity loss in these countries [
12,
14]. It is thus imperative to provide support for these countries, while also investigating and implementing simpler and more cost-effective genetic methods more accessible to local researchers and institutions [
2,
12].
One such method is Inter Simple Sequence Repeats (ISSRs [
15,
16]). ISSR markers are generated from PCR reactions using flanking microsatellite regions as priming sites [
15,
16]. The ubiquity and variability of microsatellites in eukaryotes mean many priming sites are available throughout the genome leading to increased resolution and almost full genome coverage, all whilst requiring no
a priori sequence information [
15,
16,
17,
18,
19,
20,
21,
22]. The main benefits of ISSRs are their cost, speed and simplicity compared to other methods [
6,
17,
23,
24]. Moreover, the use of PCR allows the rapid generation of large volumes of markers from only a small amount of DNA [
20,
24]. ISSRs are also highly sensitive markers suitable for discriminating closely related species and investigating intraspecies variation [
21,
25,
26]. ISSRs thus offer a higher degree of resolution compared to other “fingerprinting” molecular methods [
21]. Compared to RFLPs and RAPDs, ISSR markers give similar results but produce more extensive and informative datasets at lower cost and with less time and labour [
17,
24]. In contrast, methods such as AFLP markers, although more reproducible and accurate, are more costly and complicated [
17]. ISSRs are also more reproducible than RAPD markers [
17], but are less reproducible than RFLPs [
24]. ISSRs are thus very useful molecular markers in ecological, genetic diversity and even systematic studies due to their hypervariable nature and low cost [
20]. As a consequence of these factors, ISSRs are used widely in developing countries for a range of purposes. A search of the Scopus database (11 March 2024) using the search terms “(inter AND simple AND sequence AND repeat*) OR ISSR*” found 7852 publications. An analysis of the author affiliations of these papers using VOSviewer ver. 1.6.15 [
27] indicated that the vast majority of these papers were from researchers in developing countries, many of which are also mega-biodiverse (
Table 1).
ISSRs have proven valuable in a wide range of applications including hybridisation and taxonomic studies [
21,
25], phylogeny reconstruction [
28], population genetic studies [
29,
30,
31,
32], demographics [
33], the investigation of the mating systems and reproduction of plants [
34], sex determination [
35,
36], distinguishing ecotypes [
37], as well as studies on crops and crop relatives and medicinal plants [
38,
39,
40,
41] and identifying markers for traits such as toxin production or phenotypes [
35,
36,
42]. Of particular relevance to our study, this method has also been applied to rare and endangered or endemic species (Xue et al. 2004; Luan et al. 2006; Pinheiro et al. 2012; Liu et al. 2013; Bentley et al. 2015; Tian et al. 2018) as well as widespread and common species [
46].
The vast majority of ISSR studies utilise conventional agarose gel visualisation of banding patterns, a very cheap and readily available technology, which perhaps explains the extensive use of this method in developing countries. However, another benefit to ISSR fingerprinting is that the primers can be modified by labelling with fluorescent dyes that allow for the automated detection of bands using DNA sequencing machines [
47]. This modification of the primers and the use of slightly more costly automated detection systems provide greater sensitivity and resolution of bands, coupled with the ability to accurately size much larger ISSR fragments. This results in larger datasets and more accurate fragment sizing, potentially differentiating fragments that differ by as little as a single nucleotide [
48,
49]. Owing to the higher sensitivity of the automated process, much larger datasets are produced, but possibly with lower marker informativeness [
37]. However, despite being available since approximately 2004 (e.g. Schrader and Graves 2004) and offering the advantages of larger datasets and comparable band sizes, this approach has not been widely used. Nevertheless, automated ISSR fingerprinting has been used effectively in plantains (
Musa L. sp. [
37]), cotton (
Gossypium L. [
50]),
Vachellia karroo (Hayne) Banfi & Galasso [
51], the endemic and widespread species within
Tolpis Adans. (Asteraceae [
47]) and endangered
Faucaria tigrina Schwantes (Aizoaceae [
45]).
Based on the above considerations and the merits of ISSRs, this study employs automated ISSR fingerprinting to determine the genetic diversity of the African cycad
Encephalartos eugene-maraisii I. Verd. species complex and to ascertain whether genetic diversity corresponds to currently defined taxonomic groups in this complex. Of relevance to this study is the fact that ISSRs have previously been used in cycads for a wide range of applications (
Table S1), but ISSRs combined with automated fragment detection have yet to be applied to cycads.
1.1. The Conservation Status of Cycads in Africa–Encephalartos as a Case Study
The African cycad genus
Encephalartos Lehm. is considered the most threatened cycad genus globally and the most threatened group of organisms in South Africa, with 12 of 37 (32%) species in South Africa listed as Critically Endangered (compared to the global average of 17% in cycads), and an additional four listed as Endangered [
52,
53]. Moreover, the five cycad species that are listed as Extinct in the Wild by the IUCN are from the genus
Encephalartos, all of which once occurred within the borders of South Africa (
E. brevifoliolatus Vorster, E. nubimontanus P.J.H.Hurter
, E. woodii Sander
, and
E. heenanii R. A. Dyer) or landlocked Eswatini (= Swaziland,
E. relictus P.J.H.Hurter). Additionally, South Africa is an important cycad diversity hotspot and site of endemism, containing 58% of
Encephalartos species, of which 80% are endemics [
54,
55,
56].
This South African cycad extinction crisis [
55,
57,
58] may result in South Africa losing 50% of its species within 2-10 years [
59]. This extinction is driven by poaching for the ornamental plant trade, the harvest of specimens for medicinal, recreational, and magical purposes [
52,
53,
54,
59,
60,
61,
62,
63], as well as pathogens [
64], herbivory [
65], and pollinator extinction [
56,
66]. Moreover, climate change, leading to greater environmental stochasticity and subsequent susceptibility to pests and pathogens [
13,
65,
67], also poses a threat, as do habitat fragmentation and destruction, the spread of alien invasive species and reproductive failure [
54,
55,
68]. Conservation of this group has thus never been more urgent.
Despite much activity in South African cycad conservation and research (Bamigboye et al., 2016; Cousins & Witkowski, 2017; NEM:BA Notice 317, 2017; Okubamichael et al., 2016; Osborne, 1995; Raimondo et al., 2013), there remains limited knowledge about even the most basic aspects of cycad biology or population size and trends for many species [
54,
71,
72,
73]. In addition, research directed at assessing the genetic diversity of South African cycads is required, as little work has been done on these taxa [
74,
75]. Moreover, the taxonomic relationships between some species, especially among closely related taxa, need to be resolved, thereby allowing the correct designation of conservation status for these taxonomic units [
54,
71,
76,
77,
78]. Many of the taxonomically unresolved portions of the genus occur within species complexes containing recently diverged taxa [
54,
55].
Species complexes comprise groups of closely related species which often co-occur or have close geographical proximity. Owing to morphological and genetic similarities, members of these complexes are often difficult to distinguish, which can lead to unclear or biased species delimitation or incorrect designation of conservation units [
78]. These complexes are additionally enigmatic in that morphological distinctness does not necessarily correlate with genetic differentiation of species, with the opposite occasionally true [
79]. Several examples of cycad species complexes exist (e.g. Sharma et al. 2004; Vovides et al. 2004; Xiao et al. 2006). In the genus
Encephalartos, such complexes include the
E. hildebrandtii A.Braun & C.D.Bouché species complex of East Africa [
83], as well as a group of mostly glaucous
Encephalartos species in the Eastern Cape Province of South Africa [
79], and the glaucous cycads comprising the
Encephalartos eugene-maraisii complex occurring in the northern escarpment of South Africa [
84].
1.1.1. The Encephalartos eugene-maraisii Complex
The
Encephalartos eugene-maraisii I. Verd. complex is a group of closely related cycads with glaucous foliage occurring mainly in the Limpopo and Mpumalanga provinces of South Africa (
Figure 1). Members of the complex comprise
E. eugene-maraisii, E. dolomiticus Lavranos & D.L.Goode
, E. middelburgensis Vorster, Robbertse & S.van der Westh.
, E. dyerianus Lavranos & D.L.Goode
, E. cupidus R.A. Dyer and
E. nubimontanus P.J.H.Hurter and potentially
E. hirsutus P.J.H.Hurter (
Table 1A). The taxonomic relationships within this complex are uncertain and there is considerable morphological variation, with some species having as many as eleven different variants recognised (formally and informally) by collectors and growers (de Klerk, 2004,
Table 1A). Within this pool of variation, there may lie undescribed species, or alternatively, species which require merging.
Broad taxonomic and phylogenetic studies on
Encephalartos place the
E. eugene-maraisii complex into a single clade with little to no resolution and weak support between the member species [
76,
86,
87]. The morpho-geographical classification of
Encephalartos proposed by Vorster (2004) places the complex, with the possible inclusion of
E. hirsutus, in the same grouping. Molecular studies by Stewart et al. (2023) and Mankga et al. (2020) supported the exclusion of
E. hirsutus from the
E. eugene-maraisii complex, but provided no further insight into the molecular or taxonomic relationships between members of the complex. Member species of this complex were poorly represented in these studies, occurring as singletons or pairs, or being absent entirely. This likely had consequences for phylogenetic resolution within this group, particularly since these taxa are closely related [
88].
2. Materials and Methods
2.1. Sampling
Young but hardened-off leaflets from the six Encephalartos species of the E. eugene-maraisii complex, as well as two samples of E. hirsutus, were sourced from a private cycad collection in White River in the Mpumalanga province of South Africa and from the cycad genebank of the South African National Biodiversity Institute’s (SANBI) Lowveld Botanical Gardens (Mbombela, Mpumalanga, South Africa) on 30 September and 1 October 2021. Additional samples from the University of Pretoria cycad collection and several private collections in Pretoria were collected on 11 May 2022. To ensure correct species identification, selected plants were cross-referenced with specimen records from each of the gardens and species identity was confirmed visually by Mr A. W. Frisby (curator of the University of Pretoria’s cycad collection). Care was taken to sample plants originating from as many disjunct localities as possible (where locality data for the individuals were available). Suspected hybrids were omitted from the study. Collected leaflets were temporarily stored in paper envelopes and refrigerated until they could be transferred to individual ziplock bags containing silica gel for desiccation.
2.2. DNA Extraction
Approximately 30 mg of silica-dried material per sample was ground with metal beads using the Geno/Grinder 2010 (Spex Sample Prep) and extracted in two batches of 96 samples using the Sbeadex Maxi Plant kit and the Oktopure robot (LGC Biosearch Technologies) in the labs of the Forest Molecular Genetics, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria. These runs additionally included duplicate sample pairs, representing material from the same plant extracted in both batches, to test for consistency between extraction runs. DNA purity and contamination were assessed spectrophotometrically by calculating the ratio of absorbance at 260 nm to that at 280 nm, and the ratio of absorbance at 260 nm to that at 230 nm (NanoDrop, Thermo Fisher Scientific).
2.3. PCR Optimisation and Selection of ISSR-PCR Primers
Six ISSR primers manufactured with 5’ fluorescent labels were screened for their suitability in amplifying cycad DNA (
Table 2). Primers that consistently produced mostly bright, clear bands for a wide range of samples in the trial PCR runs, when viewed under UV on a 1% agarose gel, were selected for this study. Selected primers were also optimised under different MgCl2 concentrations.
2.4. PCR Reaction Conditions
Following primer selection, bulk amplification of sets of 96 samples was performed in a Bio-Rad T100 thermal cycler. The 25 μl reaction mixture comprised 4 μl DNA, 25 nmol ISSR primer suspended in 1 μl TE buffer, 12.5 μl 2x Ampliqon Master mix (Ampliqon Taq DNA polymerase, 0.4 mM each dNTP, Tris-HCl pH 8.5, (NH4)2SO4, 3 mM MgCl2, 0.2% Tween™ 20, inert red dye and stabilizer) and 4 μl distilled water. The PCR was performed with an initial denaturation at 96 °C for 2.5 minutes, followed by 30 cycles of denaturation at 96 °C for 30 seconds, annealing at 52 °C for 30 seconds and extension at 72 °C for 2 minutes, and ending with a final 2-minute extension at 72 °C.
PCR products for each sample and selected primer were visualised under UV on a 1% agarose gel stained with ethidium bromide to assess amplification success, and samples with no visible bands for any of the selected primers were omitted from subsequent analyses. PCR products from the remaining samples were sent to the Central Analytical Facility, Stellenbosch University, for capillary electrophoresis and automated detection using an ABI 3130 genetic analyser equipped with fragment profiling software. The 1200LIZ size standard was used, allowing fragment size estimation between 20 and 1200 base pairs. The electropherogram from each sample was analysed using GeneMapper software version 5 (Applied Biosystems).
2.5. Construction of Data Sets for Analysis
Three datasets per primer were produced using GeneMapper, which was used to select bands based on user-defined fluorescence cut-off values of 50, 100 and 200 relative fluorescence units (rfu). These datasets represent varying levels of sensitivity to band intensity: the 50rfu cut-off was the most sensitive, scoring faint to bright bands and potentially including spurious bands, whereas the 200rfu cut-off was the least sensitive, scoring only bright bands. Bands were scored “1” (present) if brighter than the cut-off value and “0” (absent) otherwise, and invariant alleles were omitted. The resultant binary datasets were saved as spreadsheets for further analysis. Pooled datasets at each cut-off level containing the binary data for all primers were then created. These data were examined for the distribution range of band number (considered to be an indicator of amplification success). Based on these analyses, the quarter (25%) of samples with the fewest detected bands was excluded from the subsequent analyses, as the PCR was deemed to be only partially successful or unsuccessful.
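The scoring steps described above can be illustrated with a minimal Python sketch. This is not the GeneMapper workflow used in the study; the function names, the example peak table and the binning of fragments to the nearest base pair are all assumptions made for illustration only.

```python
def build_binary_matrix(peaks_by_sample, cutoff_rfu):
    """Score each fragment size as present (1) or absent (0) per sample.

    peaks_by_sample maps a sample name to (fragment size in bp, peak height
    in rfu) tuples. Binning to the nearest bp is an illustrative assumption.
    """
    scored = {
        sample: {round(size) for size, height in peaks if height > cutoff_rfu}
        for sample, peaks in peaks_by_sample.items()
    }
    loci = sorted(set().union(*scored.values()))
    matrix = {
        sample: [1 if locus in bands else 0 for locus in loci]
        for sample, bands in scored.items()
    }
    return loci, matrix


def drop_invariant(loci, matrix):
    """Remove loci that have the same score in every sample."""
    keep = [
        j for j in range(len(loci))
        if len({row[j] for row in matrix.values()}) > 1
    ]
    kept_loci = [loci[j] for j in keep]
    kept_matrix = {s: [row[j] for j in keep] for s, row in matrix.items()}
    return kept_loci, kept_matrix


def drop_lowest_quarter(matrix):
    """Exclude the quarter of samples with the fewest scored bands."""
    ranked = sorted(matrix, key=lambda s: sum(matrix[s]))
    excluded = set(ranked[: len(ranked) // 4])
    return {s: row for s, row in matrix.items() if s not in excluded}


# Hypothetical peak table: sample -> [(size in bp, height in rfu), ...]
example_peaks = {
    "A": [(100.2, 300), (150.1, 120), (200.4, 60)],
    "B": [(100.0, 250), (150.3, 40), (300.0, 210)],
    "C": [(100.1, 500), (200.0, 220)],
    "D": [(100.3, 90)],
}

loci, matrix = build_binary_matrix(example_peaks, cutoff_rfu=50)
loci, matrix = drop_invariant(loci, matrix)
retained = drop_lowest_quarter(matrix)
print(loci, sorted(retained))
```

Running the same functions with `cutoff_rfu` set to 100 or 200 would give the less sensitive datasets, which shows how the choice of cut-off changes both the number of loci and the samples retained.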
2.6. Methods to Assess Genetic Similarity and Diversity
Four different methods were chosen to analyse the data. These were cluster analysis (also called numerical taxonomy or phenetics [
89]), median-joining network analysis [
90] and STRUCTURE analysis [
91] using Bayesian Markov chain Monte Carlo (MCMC) estimation, as well as statistical analysis employing Analysis of Molecular Variance (AMOVA) and Tajima’s D statistic [
92].
2.6.1. Cluster Analysis
Genetic distance matrices were computed using various distance coefficients in NTSYS-PC version 2.02k [
93] using the SIMQUAL option (
Table S2). Distance matrices were clustered using the Unweighted Pair Group Method with Arithmetic Averages (UPGMA) and the neighbour-joining (NJ) method. To determine the appropriate clustering method and distance coefficient for the data, dendrograms were compared visually for the “logical” clustering of samples (i.e., the somewhat subjective assessment of the grouping of samples as species clusters) as well as computationally through cophenetic correlation analysis and a normalised Mantel test [
94].
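To make the cophenetic comparison concrete, the following Python sketch computes Dice distances from binary band profiles, clusters them by UPGMA and correlates the original distances with the cophenetic distances implied by the tree. It is an illustration only and not the NTSYS-PC implementation used here; the function names and the four example profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def dice_distance(a, b):
    """1 - Dice similarity, where similarity = 2 * shared / (bands_a + bands_b)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    return 1.0 - 2.0 * shared / total if total else 0.0


def upgma_cophenetic(profiles):
    """Cluster by UPGMA; return (original, cophenetic) distances per sample pair."""
    names = list(profiles)
    original = {
        frozenset(p): dice_distance(profiles[p[0]], profiles[p[1]])
        for p in combinations(names, 2)
    }
    clusters = {name: (name,) for name in names}  # cluster id -> members
    dist = dict(original)
    cophenetic = {}
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        a, b = tuple(pair)
        height = dist[pair]
        for x in clusters[a]:
            for y in clusters[b]:
                cophenetic[frozenset((x, y))] = height
        new_id = "(" + a + "," + b + ")"
        members = clusters[a] + clusters[b]
        for c in clusters:
            if c in (a, b):
                continue
            # Size-weighted mean = arithmetic average over all member pairs
            dist[frozenset((new_id, c))] = (
                len(clusters[a]) * dist[frozenset((a, c))]
                + len(clusters[b]) * dist[frozenset((b, c))]
            ) / len(members)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        del clusters[a], clusters[b]
        clusters[new_id] = members
    return original, cophenetic


def cophenetic_correlation(original, cophenetic):
    """Pearson correlation between original and cophenetic distances."""
    keys = list(original)
    xs = [original[k] for k in keys]
    ys = [cophenetic[k] for k in keys]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical binary band profiles for four samples
example_profiles = {
    "s1": [1, 1, 0, 0],
    "s2": [1, 1, 1, 0],
    "s3": [0, 0, 1, 1],
    "s4": [0, 0, 0, 1],
}
orig_d, coph_d = upgma_cophenetic(example_profiles)
print(round(cophenetic_correlation(orig_d, coph_d), 3))
```

A correlation close to one indicates that the dendrogram distorts the original distance matrix very little, which is the criterion used when comparing coefficient and clustering combinations.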
2.6.2. Statistical Analysis
Analysis of Molecular Variance (AMOVA) was performed in PopART to assess the distribution of observed genotype variation, with ΦST values calculated from 1000 permutations of ISSR haplotypes among populations. Tajima’s D statistic was also calculated to detect the presence of non-random evolution in the gene pool [
92].
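As an illustration of how Tajima’s D is obtained, the sketch below applies the standard formula (Tajima 1989) to binary haplotypes, treating each ISSR band as a biallelic site. This is an assumption-laden illustration rather than the PopART routine used in this study; the function name and the two toy datasets are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length 0/1 haplotypes; None if it is undefined."""
    n = len(haplotypes)
    if n < 4:
        return None  # the variance term is zero for fewer than four samples
    n_sites = len(haplotypes[0])
    # S: number of segregating (variable) sites
    seg = sum(1 for j in range(n_sites) if len({h[j] for h in haplotypes}) > 1)
    if seg == 0:
        return None
    # pi: mean number of pairwise differences
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg + e2 * seg * (seg - 1)
    if variance <= 0:
        return None
    return (pi - seg / a1) / sqrt(variance)


# Toy data: every variable band occurs in a single sample (excess of rare alleles)
rare_alleles = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Toy data: every variable band occurs in half of the samples
balanced_alleles = [[1, 1], [1, 1], [0, 0], [0, 0]]

print(tajimas_d(rare_alleles) < 0, tajimas_d(balanced_alleles) > 0)
```

The first toy dataset, dominated by rare bands, gives a negative D, whereas the second, with bands at intermediate frequency, gives a positive D.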
2.6.3. STRUCTURE Analysis (Bayesian MCMC)
The Bayesian clustering of the populations was assessed using STRUCTURE software version 2.3.2.1 [
91] for each of the three pooled datasets. Five independent runs, each with a burn-in of 1000 iterations followed by an MCMC chain of 10 000 generations, were performed for each number of populations (K) from one to ten. Alleles were treated as haploid. Another five independent runs were performed for the same K-values using the LOCPRIOR model in STRUCTURE, which uses sampling locality as prior information. For the purposes of this study, and owing to a lack of precise locality information for all samples, each species was considered to be a single locality. All runs were performed under the admixture model with correlated allele frequencies. The optimal K-value, generally considered the smallest K for which the probability of the observed data is maximised, was determined using STRUCTURE HARVESTER [
95] based on the method developed by Evanno et al. (2005).
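The ΔK statistic of Evanno et al. (2005) can be sketched in Python as follows. The sketch assumes that ΔK is taken as the absolute second-order difference of the mean log-likelihood across runs divided by the sample standard deviation of the log-likelihood at that K; it is not the STRUCTURE HARVESTER code, and the function name and log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob_by_k):
    """Return {K: delta K} for every K that has both K - 1 and K + 1 available.

    ln_prob_by_k maps K to the ln P(D) values of the replicate runs.
    """
    ks = sorted(ln_prob_by_k)
    mean_l = {k: mean(ln_prob_by_k[k]) for k in ks}
    delta_k = {}
    for k in ks[1:-1]:
        if k - 1 not in mean_l or k + 1 not in mean_l:
            continue  # delta K is undefined without both neighbours
        second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
        spread = stdev(ln_prob_by_k[k])
        if spread > 0:
            delta_k[k] = second_diff / spread
    return delta_k


# Hypothetical ln P(D) values for three replicate runs at K = 1 to 5
example_ln_prob = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-800, -801, -799],
    4: [-795, -790, -800],
    5: [-793, -785, -801],
}
delta = evanno_delta_k(example_ln_prob)
best_k = max(delta, key=delta.get)
print(best_k, delta)
```

Because ΔK depends on the neighbouring values of K, it cannot be evaluated for the smallest or largest K tested, so a true K of one cannot be detected by this method.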
2.6.4. Network Analysis
Haplotype network analysis was performed in PopART software (
http://popart.otago.ac.nz; [
96]) using the Median Joining method with epsilon set to zero [
90].
4. Discussion
This study has demonstrated the value and utility of automated ISSR fingerprinting for investigating genetic variation among closely related
Encephalartos species. The use of fluorescently tagged ISSR markers and fragment profiling software allowed the rapid generation and scoring of numerous bands with minimal setup, while being sufficiently sensitive to discern most of the species in the
Encephalartos eugene-maraisii complex. While not as cheap as using conventional agarose gel electrophoresis for band visualisation, the accurate band sizing and increased sensitivity of band detection make the additional expense of automated detection worthwhile [
22]. Other studies have also reported approximately three times the number of loci or bands per primer compared with conventional agarose or capillary electrophoresis methods [
37,
47].
The 50rfu dataset proved the most informative for our data, with the exception of AMOVA and cluster analysis, where results from the 50rfu and 100rfu datasets were comparable. STRUCTURE and network analyses were less informative at higher cut-off values. The exclusion of the lowest 25% of samples additionally reduced noise and unexpected clustering in the data. In our study, the duplicate samples provided insight into the variability of extraction and amplification success and, by showing whether duplicates clustered together, assisted in identifying effective and ineffective clustering methods. Duplicate pairs clustered closest to one another in the 100rfu dataset in the cluster and network analyses. Positions of duplicates in STRUCTURE analyses were uncertain. Curiously, duplicates were most separated in the 50rfu datasets, which performed slightly worse than the 200rfu datasets. Analyses based on the 100rfu dataset, however, produced less discrete species groupings than the 50rfu dataset, suggesting that potentially spurious bands in the latter dataset affected the fine-scale resolution of the analysis, but not broader species groupings. This may additionally suggest that an intermediate cut-off between 50rfu and 100rfu would produce a dataset that more accurately represents both the broad- and fine-scale variation in this species complex.
4.1. General Findings
4.1.1. Statistical Analysis
The Tajima’s D statistic was negative for all datasets, although not significant. Negative D statistics are sometimes associated with populations which underwent recent population expansion following a bottleneck, or populations with numerous rare alleles [
92]. In this case, however, it may indicate that this complex is a recently diversified group. This concurs with molecular clock data on
Encephalartos indicating a recent radiation of this group [
86,
97,
98].
The AMOVA revealed higher intraspecific than interspecific variation. High within-population variation has also been reported in other species of cycads [
99,
100,
101]. The ΦST of 0.3 indicates moderate genetic isolation of the species. This, however, is unsurprising as ΦST is used mainly for population-level variation; thus, genetic differentiation between these species is presumed. The 100rfu dataset detected the most variation between populations and had the highest fixation value of all datasets. The ΦST fixation index, ranging from 0.3 to 0.6, shows higher genetic differentiation than reported in other closely related
Encephalartos species, where a value of one indicates complete genetic differentiation. Van der Bank et al. (2001) reported a value of 0.1 (FST) for
E. horridus (Jacquin) Lehmann;
E. latifrons Lehmann;
E. lehmannii Lehmann;
E. longifolius;
E. princeps R. A. Dyer; and
E. trispinosus (Hooker) R. A. Dyer.
4.1.2. STRUCTURE Analysis
The STRUCTURE analysis revealed distinctive groupings of
E. eugene-maraisii,
E. dolomiticus,
E. dyerianus, and
E. hirsutus, consistent with current species definitions.
E. hirsutus samples shared very few alleles with other species groups, emphasising its singularity from the
E. eugene-maraisii complex.
E. nubimontanus, although distinctive, formed two groups, one of which showed similarity to
E. cupidus. This may indicate historical gene flow between these species.
E. nubimontanus contains many morphological variants [
85], some of which (such as “Robusta”) have instead been argued to be forms of
E. cupidus [
102]. It is possible that the species had historically overlapping ranges resulting in morphological and genetic overlap between these species.
E. middelburgensis samples were composed of proportions of the other groups and were not distinguishable from other samples based on their allele frequencies, suggesting this is not a valid taxon. Alternatively, the ISSR-PCR may have amplified this species poorly. Although most individuals possessed allele frequencies typically corresponding to those of conspecifics, there were exceptions where isolated samples had allele frequencies more closely corresponding to those of other species groups. Allele frequencies of some members of
E. dolomiticus and
E. cupidus, for instance, closely resemble those of
E. hirsutus samples. These individual samples were also unexpectedly clustered with other species groups in the dendrograms and network analyses (
Figure 6;
Figure S3). Although the results suggest that these species are genetically indistinguishable from
E. hirsutus, this is unlikely since
E. hirsutus is phylogenetically distinct from the
E. eugene-maraisii complex [
86,
103]. The behaviour of these samples may instead reflect missing alleles or numerous spurious bands since samples deemed genetically similar to
E. hirsutus samples all possess approximately 50 bands, whereas
E. hirsutus samples possessed 83 and 73 bands respectively. A further consideration is that STRUCTURE assumes Hardy-Weinberg equilibrium and no inbreeding [
91,
104]. Due to these cycads’ historical isolation and present cultivation in ex-situ populations, panmixia of these individuals cannot be assumed.
4.1.3. Cluster Analysis
In the cluster analysis, E. nubimontanus formed two separate groups in most trees, similar to the STRUCTURE analysis. E. cupidus often grouped within, or had groups alternating with, E. nubimontanus clusters. Another general tendency was for E. eugene-maraisii samples to cluster together, with E. dolomiticus clustered within this group. This suggests that E. dolomiticus originated as a subpopulation of E. eugene-maraisii which underwent genetic isolation. These species do, moreover, share some morphological characteristics, further supporting this hypothesis.
The choice of clustering method did not appear to affect tree topology dramatically. NJ analysis allows varying rates of evolution between species, while UPGMA assumes fixed rates of evolution [
23]. Mankga et al. (2020) suggest evolutionary rates among
Encephalartos are constant and that a constant-rate-diversification model may be most suitable for analyses. This may explain why little difference was observed between trees using each clustering method. Authors such as Archibald et al. (2006) opted for NJ analyses using the Dice coefficient [
105], an approach we have chosen to follow here. Notable differences in our analyses between NJ and UPGMA include the clustering of
E. eugene-maraisii samples together into one group in NJ dendrograms, while in UPGMA dendrograms
E. dolomiticus formed a nested group within
E. eugene-maraisii. Moreover, in the NJ analysis,
E. hirsutus formed a solitary cluster branched from all other samples, indicating its apparent genetic distinctness from the rest of the samples, compared to the more central position of
E. hirsutus samples in the UPGMA. The positions of
E. nubimontanus and
E. cupidus in NJ dendrograms also differ from the UPGMA trees in that they clustered together as overlapping groups.
4.1.4. Network Analysis
The network analysis differed notably from the cluster analysis in its placement of E.
cupidus and
E. nubimontanus, where networks show more dissimilarity between these groups.
E. dolomiticus, E. cupidus and
E. dyerianus also appear to branch directly from
E. eugene-maraisii suggesting these groups differentiated from
E. eugene-maraisii and that these populations were once linked. However, network analyses such as these merely provide insights into the genetic similarity of individuals and should not be used to infer phylogeny [
96].
4.2. Species Delimitation within the E. eugene-maraisii Complex
Our analyses provide further support for the taxonomic singularity of E. hirsutus and its exclusion from the E. eugene-maraisii complex, concurring with the findings of Rousseau (2012), Stewart et al. (2023) and Williamson (2021). Within the E. eugene-maraisii complex, however, although analyses were able to partially distinguish species groups in most cases, it is uncertain whether all species delimitations are justified. Based on the above analyses, we briefly discuss each species within the E. eugene-maraisii complex.
4.2.1. E. middelburgensis
In our analyses,
E. middelburgensis showed no clustering, with samples dispersed across species groups, while showing some affinity to
E. nubimontanus and
E. dyerianus samples. Although the absence of grouping in
E. middelburgensis could be attributed to their poor amplification rate (
Table 3), these samples may be insufficiently distinct, morphologically or genetically, to warrant specific rank. It has been proposed that
E. middelburgensis is a subspecies or merely a cline of
E. eugene-maraisii since morphological characters that distinguish
E. middelburgensis from
E. eugene-maraisii are not as distinct as those distinguishing
E. dyerianus or E.
dolomiticus from
E. eugene-maraisii [
106]. Our study, however, revealed no affinity between
E. middelburgensis and
E. eugene-maraisii samples. The vast geographical distance between these two taxa relative to other members (
Figure 1) further supports this finding. Further investigation is nonetheless required to elucidate the taxonomic singularity of
E. middelburgensis.
4.2.2. E. nubimontanus
E. nubimontanus displayed large variation in our analyses, with samples mostly separating into two loosely clustered groups. This may reflect high genetic diversity of the group. The species does contain considerable morphological variation, with some variants such as “Robusta” argued instead to be forms of
E. cupidus [
85,
102,
107]. This calls into question the validity of
E. nubimontanus as a distinct species. However, there was no evidence that samples representing the same variants consistently grouped together. A possible explanation for this is inconsistent amplification success of
E. nubimontanus samples, or incorrect data capture of cycads by private owners. Since a number of samples used in our study were procured from garden specimens and not from habitat, we relied on records of these plants maintained by the cycad owners, which were often brief, ambiguous, or incomplete. So-called distinguishing characteristics of some of these
E. nubimontanus variants may also be exaggerated or represent morphological extremes of
E. nubimontanus. To date, molecular studies have been unsuccessful in grouping
E. nubimontanus individuals together with conspecifics within the complex [
76,
87]. Alternatively, these results may reflect the presence of a cryptic lineage within
E. nubimontanus, or hybridisation with extinct or extant lineages which generates variation in some populations [
108], or hybridisation with
E. cupidus, with which it historically co-occurred in habitat. Other authors have also speculated on a history of hybridisation or reticulation in other South African
Encephalartos species [
76,
79].
4.2.3. E. cupidus
Depending on the dataset used,
E. cupidus samples either formed a discrete group, or a group which overlapped with another species group, such as
E. nubimontanus. In the analyses using the 100rfu dataset,
E. cupidus and
E. nubimontanus sample clusters overlapped or alternated in cluster analyses, or had shared membership to certain STRUCTURE groups. As mentioned previously, these two species historically shared distribution ranges, potentially leading to hybridisation, or slower speciation [
78,
79]. These species may also represent incompletely separated lineages, which might warrant subspecific rank [
109]. This may explain the morphological and genetic overlap between
E. nubimontanus and
E. cupidus reflected in some of our analyses. In contrast, the network analysis based on the 50rfu dataset showed
E. cupidus branching from
E. eugene-maraisii without apparent similarity to
E. nubimontanus.
4.2.4. E. eugene-maraisii
E. eugene-maraisii formed a distinctive cluster in most analyses, but was closely associated with
E. dolomiticus, which frequently formed a discrete cluster nested within the
E. eugene-maraisii cluster. This hints at
E. eugene-maraisii being the ancestral species of
E. dolomiticus. The network analysis for the 50rfu dataset also showed that species clusters of
E. cupidus,
E. dyerianus and
E. dolomiticus branched from the cluster of
E. eugene-maraisii, suggesting genetic similarity between these species. These associations with
E. eugene-maraisii, however, are not as obvious in STRUCTURE or cluster analyses.
E. dyerianus and
E. dolomiticus were originally thought to be ecotypes of
E. eugene-maraisii and received infraspecific ranks following a morphological and anatomical study [
110], but were eventually raised to species rank in subsequent taxonomic revisions [
106,
111].
4.2.5. E. dolomiticus
Our analyses frequently grouped
E. dolomiticus samples into a discrete cluster, which was also often nested within clusters of
E. eugene-maraisii samples. This suggests that
E. dolomiticus is a subspecies of
E. eugene-maraisii, or a more recently diverged species. Assigning subspecies rank to
E. dolomiticus may, however, not be suitable, as
E. dolomiticus is restrictively adapted to very specific soil conditions [
107] and comprises a single, highly isolated population [
54,
106]. This may disqualify
E. dolomiticus from being a subspecies of
E. eugene-maraisii in the context of the ecological species concept [
112]. Moreover, the species may be legitimately delimited through the Unified Species Concept [
113,
114], which is similar to the Genealogical Species Concept [
115], but whose only criterion is that populations are presently evolving independently from one another regardless of historical associations. This concept therefore acknowledges the possibility of species merging and separating through time [
116], which may have occurred in cycad species given their ability to hybridise [
76,
79,
84,
117,
118].
4.2.6. E. dyerianus
Throughout most of the analyses
E. dyerianus samples formed a distinct cluster and did not appear to be closely associated with other species groups. This further justifies the designation of this group as a species.
E. dyerianus, moreover, comprises a single population situated on a remote outcrop despite growing in soils of similar geography to other cycads in the
E. eugene-maraisii complex [
54,
106]. The cohesion of this group may indicate a long history of isolation and genetic differentiation, in contrast to
E. dolomiticus, which shows signs of having diverged from
E. eugene-maraisii. The sample representing the “levubuensis” variant (dye11Lb) also frequently grouped with members of
E. dyerianus, suggesting that this variant is a disjunct population of
E. dyerianus. “Levubuensis” specimens are reported to be almost indistinguishable from
E. dyerianus, despite their large geographical separation, but are also speculated to be an undescribed species (A. W. Frisby pers. comm.). In contrast, the sample 16Lb, which may have been recorded as a “levubuensis” sample in error, did not consistently group with any species group. It is thus uncertain to which species group this sample belongs.
4.3. Methodological Critique
The use of automated methodologies for DNA extraction and ISSR fragment detection and profiling shows great promise in genetic studies of rare species. Automated DNA extraction using the Oktopure robot and Sbeadex technology proved a reliable, rapid, and consistent method for extracting cycad DNA, and may be a useful method for high-throughput extractions. This method proved more effective than traditional CTAB methods [
119], producing higher concentration DNA. It may, nonetheless, benefit from further optimisation to improve DNA purity and reduce contamination. The use of fluorescently tagged primers and automated detection by a genetic analyser allowed successful and rapid identification and scoring of hundreds of amplified bands whilst avoiding errors introduced by manual visualisation and recording. It also generated an adequate number of bands for STRUCTURE analysis (over the 200 bands recommended by Ng & Tan (2015) for a ΦST exceeding 0.1). Given the dominant nature of ISSR markers and the lower information content of these markers relative to codominant markers, having numerous bands is important to generate sufficient resolution for distinguishing taxa [
121]. Although our final analyses included only 92 samples, our final datasets contained 663 loci in the 50rfu dataset, 439 loci in the 100rfu dataset and 113 loci in the 200rfu dataset. A higher allele number may, in this case, prove more informative than a greater sample number if increasing allele number generates greater genetic distances between taxa, leading to higher resolution [
37,
88]. Moreover, considering the high genetic similarity among cycads and their slow evolutionary rate and divergence, using noncoding DNA, such as ISSRs, as a source of variation can be more informative than using coding regions. Because coding regions evolve more slowly, they are more conserved than noncoding regions [
122]. Obtaining DNA from multiple sources in the genome also reduces bias introduced by using DNA from a single source, such as nuclear DNA [
22,
116].
Despite the promise of these methods, we have identified several improvements that may make future applications of these methods even more reliable and rigorous.
4.3.1. Spurious Bands and Reproducibility
One of the main challenges of ISSR analysis relates to a lack of reproducibility due to spurious bands, making transfer of results between labs difficult [
6,
120]. Issues of reproducibility can be addressed through the comparison of replicates and keeping only common bands [
49], and ignoring fainter bands, which are likely spurious (Ng & Tan, 2015). In this study, the use of the Oktopure DNA extraction robot to perform DNA extractions in bulk is also a way to reduce variability in the DNA extraction process. Prince (2015) also recommends using clean PCR products, as well as the same thermal cycler and settings, for best reproducibility. In addition, removal of common contaminants in DNA extracts which inhibit PCR may improve reproducibility [
123,
124]. Cycad leaves contain high levels of polysaccharides, proteins and secondary metabolites which can co-precipitate with DNA and interfere with PCR [
124,
125]. Removal of some of these contaminants can be achieved using commercially available purification columns or the use of reagents such as polyvinylpolypyrrolidone (PVPP) to eliminate polyphenols as well as NaCl to remove polysaccharides [
126].
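The replicate comparison described above can be sketched as follows. This is a minimal illustration, assuming each replicate is scored as a list of 0/1 values over the same loci; the function names and the example profiles are hypothetical and are not taken from the study.

```python
def consensus_bands(rep_a, rep_b):
    """Keep a band only if it is scored present (1) in both replicates."""
    if len(rep_a) != len(rep_b):
        raise ValueError("replicates must be scored over the same loci")
    return [1 if a == 1 and b == 1 else 0 for a, b in zip(rep_a, rep_b)]


def mismatch_rate(rep_a, rep_b):
    """Fraction of loci scored differently between the two replicates."""
    if len(rep_a) != len(rep_b):
        raise ValueError("replicates must be scored over the same loci")
    return sum(a != b for a, b in zip(rep_a, rep_b)) / len(rep_a)


# Hypothetical band profiles for two extractions of the same plant.
rep1 = [1, 0, 1, 1, 0, 1]
rep2 = [1, 0, 0, 1, 1, 1]

consensus = consensus_bands(rep1, rep2)  # bands reproduced in both replicates
mismatch = mismatch_rate(rep1, rep2)     # a simple per-sample error estimate
print(consensus, round(mismatch, 3))
```

The mismatch rate gives a rough indication of how many bands are not reproducible for a given sample, which may help in deciding whether that sample should be re-amplified.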
Since there appeared to be a species-specific link to amplification success with the various primers (
Table 3), it may be necessary to individually optimise PCR reactions for each species or increase the number of primers used. Individual primers additionally showed differing suitability for each species, further justifying the use of additional primers in the study. Although not done in this study, the standardisation of DNA concentrations [
120] might have improved the success of PCR amplification of some of our samples.
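Standardising DNA concentrations before PCR amounts to a dilution calculation (C1 × V1 = C2 × V2). The sketch below is illustrative only: the target concentration and final volume are assumed values, not settings used in this study, and the function name is hypothetical.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock volume, diluent volume) in µl, from C1 * V1 = C2 * V2."""
    if min(stock_ng_per_ul, target_ng_per_ul, final_volume_ul) <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Example: the mean concentration reported for E. eugene-maraisii in Table 4
# (302.6 ng/µl), diluted to an assumed working stock of 50 ng/µl in 20 µl.
stock_ul, diluent_ul = dilution_volumes(302.6, 50.0, 20.0)
print(f"{stock_ul:.2f} µl extract + {diluent_ul:.2f} µl diluent")
```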
4.3.2. Sampling Effort and Cost Reduction Strategies
Our study could additionally be improved by optimising sampling effort [
127]. In our study, 180 plants were sampled, but over half of these were excluded from the study due to poor amplification of ISSRs, resulting in wasted costs on reagents and sampling time (
Table S3). Improvement of DNA extraction and PCR amplification of our samples will likely offset these costs by improving the success rate of amplification and reducing the need for sampling in excess. In addition, multiplexing primers marked with two different dyes in the same well of the genetic analyser is another way to reduce costs, while potentially generating more bands than the sum of bands from two separate wells [
50]. Another potentially important consideration is screening for epiphyte or endophyte contamination of samples [
22], which can influence the banding patterns of target DNA and may exaggerate genetic diversity [
128].
With modification to the ISSR method, high-throughput sequencing technologies may also be employed to sequence ISSR fragments in Multiplexed ISSR Genotyping-by-Sequencing (MIG-seq, [
9]). This method has been successfully employed on
Dioon Lindl. [
129,
130]. As high-throughput sequencing continues to fall in cost, it may become accessible to more modest budgets [
1,
2]. A potential setback to these methods is the requirement of high molecular weight DNA and greater methodological complexity. However, the method can be modified to address these issues [
131].
5. Conclusions
Using this automated ISSR method and a range of analytical approaches, we were able to assess and partially confirm the taxonomic delimitation within the E. eugene-maraisii complex. However, for some of the species, we recommend additional sampling and further optimisation of DNA extraction and PCR amplification procedures. In addition, the use of additional primers should improve resolution and may elucidate the relationship between E. nubimontanus and E. cupidus, and the taxonomic validity of E. middelburgensis.
In addition, our study has highlighted the importance of using a variety of datasets and analytical methods to explore the signal in the data. Some datasets are more suitable than others depending on the analysis used. STRUCTURE provided robust and easy-to-visualise Bayesian analyses but required samples with high band number and needed moderate setup related to model selection and model settings. Network analysis, while also requiring samples with high band number, was easy to set up and produced an easy-to-visualise graphic output, where sample groups could be quickly discerned. Networks may prove extremely useful for initial visualisation of a new dataset, and in identifying problematic samples. They are also a valuable supplement to other analyses like STRUCTURE. Finally, the cluster analysis proved an effective method for grouping samples, requiring fewer bands per sample than STRUCTURE and network analyses to produce meaningful results. Careful selection of suitable similarity coefficients for the data, however, is necessary since this can greatly alter the topology of the dendrograms. Therefore, an understanding of how each similarity coefficient weighs apparent similarities in data, such as band absences, is required [
22].
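The effect of shared band absences on similarity can be illustrated with a short comparison of the Dice coefficient (used for the dendrograms in this study) and the simple matching coefficient. This is a minimal sketch with hypothetical band profiles; the function names are illustrative.

```python
def dice_similarity(x, y):
    """Dice coefficient: 2a / (2a + b + c). Shared absences are ignored."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # shared presences
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)  # present in x only
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)  # present in y only
    denominator = 2 * a + b + c
    return 2 * a / denominator if denominator else 0.0


def simple_matching(x, y):
    """Simple matching: (a + d) / n. Shared absences count as similarity."""
    return sum(1 for i, j in zip(x, y) if i == j) / len(x)


# Two hypothetical samples that share one band and many absent bands.
sample_1 = [1, 1, 0, 0, 0, 0, 0, 0]
sample_2 = [1, 0, 1, 0, 0, 0, 0, 0]

print(dice_similarity(sample_1, sample_2))   # 0.5
print(simple_matching(sample_1, sample_2))   # 0.75
```

Because a band may be absent from a dominant marker profile for several unrelated reasons, coefficients that count shared absences can inflate apparent similarity, which is one reason the choice of coefficient changes dendrogram topology.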
Finally, our study demonstrates the suitability of automated ISSR fingerprinting as a rapid, simple, and cost-effective method to investigate genetic diversity and taxonomic limits in closely related and range-restricted Encephalartos species, and we suggest that this approach is applicable to a wide range of taxa. The method produced numerous bands with minimal setup, was sufficiently sensitive to distinguish many of the species in the E. eugene-maraisii complex, and shows great potential for application to the conservation genetics and taxonomy of other taxa by scientists in developing countries.
Author Contributions
Conceptualization, N.P.B.; methodology, D.M. and N.P.B.; software, D.M.; validation, A.W.F., D.M. and N.P.B.; formal analysis, D.M.; investigation, D.M.; resources, A.W.F. and N.P.B.; data curation, D.M.; writing – original draft preparation, D.M.; writing – review and editing, A.W.F. and N.P.B.; visualization, D.M.; supervision, N.P.B. and A.W.F.; project administration, A.W.F. and N.P.B.; funding acquisition, D.M. and N.P.B. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Map of the Limpopo province of South Africa showing the approximate location of members of the Encephalartos eugene-maraisii complex in this study.
Figure 2.
Boxplots showing the Nanodrop readings for DNA concentration in ng/µl (a) and DNA absorbance ratios indicating purity (b), in samples which were included in the study (blue plots) and those excluded (orange plots) due to unsuccessful PCR amplification. Means are denoted by X and medians by horizontal lines inside the boxes. Outliers are denoted by dots.
Figure 3.
STRUCTURE barplots showing the proportion of membership of samples assigned to K = 7 clusters within the Encephalartos eugene-maraisii complex. Results are based on ISSR fragments scored at a 50 relative fluorescence unit (rfu) cut-off value. The dataset was tested using the Standard STRUCTURE model (a) and the LOCPRIOR model (b) that accounts for known locality data prior to the run. Colours represent each of the predefined clusters to which each sample is assigned. Species are numbered 1 – 7 on the x-axis, from left to right: E. nubimontanus, E. eugene-maraisii, E. cupidus, E. middelburgensis, E. dolomiticus, E. dyerianus, E. hirsutus.
Figure 4.
Neighbor joining analysis of the Encephalartos eugene-maraisii complex based on ISSR markers with a minimum band intensity of 50 relative fluorescence units (rfu). Band presence and absence were used to compute genetic distances using the DICE coefficient. Colour shading indicates each species group, and specimen names are represented by the first three letters of their species epithet, corresponding to
Table S3. Sample duplicates, representing material obtained from the same plant, but extracted in a different DNA extraction batch, are indicated by the coloured rectangles.
Figure 5.
NJ dendrogram of the Encephalartos eugene-maraisii complex based on ISSR markers at a relative fluorescence unit (rfu) cut-off of 100rfu. Band presence and absence was used to compute genetic distances using the DICE coefficient. Colour shading indicates each species group, and specimen names are represented by the first three letters of their species epithet, corresponding to Appendix Table S2. Sample duplicates, representing material obtained from the same plant, but extracted in a different DNA extraction batch, are indicated by the coloured rectangles.
Figure 6.
Median joining network of the Encephalartos eugene-maraisii complex based on ISSR markers with a minimum band intensity of 50 relative fluorescence units (rfu). Colours denote the species of each sample in this study.
Rank | Country | Number of Publications
1 | India (8) | 1591
2 | China (4) | 1493
3 | United States of America (10) | 625
4 | Iran | 504
5 | Brazil (1) | 447
6 | Egypt | 384
7 | Turkey | 275
8 | Italy | 246
9 | Saudi Arabia | 244
10 | Russian Federation | 225
11 | Poland | 186
12 | Spain | 181
13 | Germany | 179
14 | Japan | 164
15 | Mexico (5) | 153
16 | United Kingdom | 150
17 | France | 140
18 | Canada | 133
19 | Australia (6) | 129
20 | Portugal | 118
21 | Malaysia (15) | 109
22 | Thailand (20) | 102
23 | Indonesia (2) | 94
24 | South Korea | 86
25 | Argentina | 83
26 | Tunisia | 76
27 | Greece | 64
28 | Pakistan | 59
29 | South Africa (19) | 54
30 | Czech Republic | 52
Table 2.
ISSR primers screened in this study. The primers are manufactured by Inqaba Biotechnical Industries.
ISSR Primer Name | 5’ Fluorescent Marker | Sequence
Manny | 6-FAM | CACCACCACCACRC
812 | HEX | GAGAGAGAGAGAGAGAA
Mao | TET | CTCCTCCTCCTCRC
Omar | HEX | GAGGAGGAGGAGRC
864 | 6-FAM | ATGATGATGATGATGATG
856 | TET | ACACACACACACACACYA
Table 3.
Comparison of twelve datasets for different ISSR primers at three relative fluorescence unit (rfu) cut-offs. Datasets were generated using GeneMapper Software Version 5 (Applied Biosystems, USA) and based on electropherogram outputs obtained from the ABI3130 genetic analyser. Bands were scored “1” for presence if above the rfu threshold and “0” for absence if below this threshold.
Primer | Minimum Fluorescence (rfu) | Total Uniquely Sized Bands | Mean Bands per Sample | Private Bands
ISSR Mao (TET) | 50 | 111 | 12 | 32
ISSR Mao (TET) | 100 | 83 | 7 | 22
ISSR Mao (TET) | 200 | 30 | 4 | 6
ISSR 864 (6-FAM) | 50 | 459 | 44 | 68
ISSR 864 (6-FAM) | 100 | 327 | 19 | 32
ISSR 864 (6-FAM) | 200 | 73 | 12 | 28
ISSR 856 (TET) | 50 | 93 | 11 | 21
ISSR 856 (TET) | 100 | 29 | 3 | 5
ISSR 856 (TET) | 200 | 10 | 1 | 1
Combined | 50 | 663 | 22 | 121
Combined | 100 | 439 | 10 | 59
Combined | 200 | 113 | 5 | 35
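The presence/absence scoring described in the Table 3 caption can be sketched as a simple threshold on peak height. The peak heights below are hypothetical, the function name is illustrative, and treating a peak exactly at the cut-off as absent is an assumption; GeneMapper's own binning and scoring steps are not reproduced here.

```python
def score_bands(peak_heights, cutoff_rfu):
    """Score each fragment size 1 if its peak height exceeds the cut-off, else 0."""
    return {size: int(height > cutoff_rfu) for size, height in peak_heights.items()}


# Hypothetical peaks for one sample: fragment size (bp) -> peak height (rfu).
peaks = {210: 48.0, 345: 130.0, 512: 260.0, 630: 95.0}

band_counts = {}
for cutoff in (50, 100, 200):
    profile = score_bands(peaks, cutoff)
    band_counts[cutoff] = sum(profile.values())
    print(cutoff, profile)
```

Raising the cut-off removes weaker peaks first, which is consistent with the decline in band numbers from the 50rfu to the 200rfu datasets in Table 3.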
Table 4.
An analysis of DNA purity as determined by Nanodrop as relating to the success of DNA extractions and subsequent PCR amplification success. Percentage successful amplification was calculated based on the number of PCR amplifications which produced clear, distinguishable bands as a percentage of the total PCR amplifications performed with the three selected primers. DNA extracts of samples which did not form bands on agarose gel for any of the primers and were subsequently omitted from the study are shown for the first and second batches of automated DNA extraction.
Species | Mean DNA Concentration (ng/μl) | Mean 260/280 | Mean 260/230 | Number of Individuals | Percentage Successful Amplification
E. eugene-maraisii | 302.6 | 1.64 | 0.5 | 35 | 55.5%
E. nubimontanus | 373.1 | 1.34 | 0.58 | 48 | 54.1%
E. hirsutus | 328.2 | 1.65 | 0.53 | 2 | 100%
E. dyerianus | 286.5 | 1.4 | 0.6 | 27 | 56%
E. middelburgensis | 294.4 | 1.45 | 0.55 | 23 | 24.6%
E. cupidus | 491.8 | 1.3 | 0.62 | 46 | 36.2%
E. dolomiticus | 344.3 | 1.6 | 0.49 | 13 | 60%
Omitted samples Batch 1 | 350.3 | 1.61 | 0.51 | 46 | 0%
Omitted samples Batch 2 | 610.4 | 1.20 | 0.62 | 23 | 0%
Table 5.
Statistics used in calculating Tajima’s D statistic for genetic isolation between species of the Encephalartos eugene-maraisii complex computed in PopART software, based on ISSR fragments.
rfu Cut-Off Dataset | 50 | 100 | 200
Nucleotide diversity (π) | 0.111362 | 0.06931 | 0.154567
Segregating sites | 474 | 207 | 105
Tajima's D statistic | -0.70597 | -0.84998 | -0.50775
Significance (p) | 0.743371 | 0.79021 | 0.673339
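For readers unfamiliar with how the quantities in Table 5 relate to one another, the sketch below computes Tajima's D from a binary band matrix using the standard formula (mean pairwise differences compared with the number of segregating sites). It is an illustration under stated assumptions: the input profiles are hypothetical, the function name is illustrative, and PopART's own implementation and handling of ISSR data may differ.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(profiles):
    """Tajima's D for equal-length 0/1 profiles (one per sample, n >= 4)."""
    n = len(profiles)
    if n < 4:
        raise ValueError("at least four samples are needed")
    # Segregating sites: loci that are not identical across all samples.
    seg = sum(1 for column in zip(*profiles) if len(set(column)) > 1)
    if seg == 0:
        raise ValueError("no segregating sites")
    # Mean number of pairwise differences between samples.
    pairs = list(combinations(profiles, 2))
    k = sum(sum(a != b for a, b in zip(p, q)) for p, q in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Hypothetical data: every variable locus is present in a single sample only,
# an excess of rare variants that yields a negative D.
singletons = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(tajimas_d(singletons))
```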
Table 6.
AMOVA of taxa in the Encephalartos eugene-maraisii complex computed in PopART Software, based on ISSR fragments.
rfu Cut-Off Dataset | 50 | 100 | 200
Variation among populations (%) | 30.77251 | 34.35135 | 27.36688
Variation within populations (%) | 69.22749 | 65.64865 | 72.63312
Fixation index (1000 permutations) (ΦST) | 0.30773 | 0.34351 | 0.27367
Significance (1000 permutations) | < 0.001 | < 0.001 | < 0.001
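In a single-level AMOVA, ΦST is the proportion of total variance found among populations, which is why the fixation indices in Table 6 match the among-population percentages. The short sketch below shows this relationship using the Table 6 values; the function name is illustrative.

```python
def phi_st(variance_among, variance_within):
    """PhiST as the among-population share of the total variance."""
    total = variance_among + variance_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return variance_among / total


# Percentages of variation among and within populations, from Table 6.
table_6 = {50: (30.77251, 69.22749), 100: (34.35135, 65.64865), 200: (27.36688, 72.63312)}
for cutoff, (among, within) in table_6.items():
    print(cutoff, round(phi_st(among, within), 5))
```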
Table 7.
Results from the Evanno method generated from STRUCTURE Harvester used to determine the optimal value of K. Five independent runs of each model were run in STRUCTURE software using a 1000-iteration burn-in and 10 000 MCMC iterations. The table shows entries of the runs with the top three Delta K values for each STRUCTURE model used. The K value with the largest Delta K is assumed to be the optimal K.
STRUCTURE Model | K | Reps | Mean LnP(K) | Stdev LnP(K) | Ln'(K) | |Ln''(K)| | Delta K
50rfu dataset
Combined | 3 | 10 | -10301.7 | 18.22118 | 703.47 | 20173.2 | 1107.129
 | 4 | 10 | -29771.5 | 44953.17 | -19469.7 | 34093.06 | 0.758413
 | 7 | 10 | -99361.9 | 134082.7 | -10976.7 | 72995.87 | 0.544409
Standard | 3 | 5 | -10308.8 | 21.45484 | 669.58 | 36468.1 | 1699.761
 | 2 | 5 | -10978.4 | 56.06913 | 755.44 | 85.86 | 1.531324
 | 4 | 5 | -46107.3 | 62036.56 | -35798.5 | 51042.94 | 0.822788
LOCPRIOR | 3 | 5 | -12050.2 | 7.81812 | 824 | 3948 | 504.9807
 | 2 | 5 | -12874.2 | 8.16566 | 1958.34 | 1134.34 | 138.9159
 | 9 | 5 | -131320 | 28079.31 | -42412 | 137610.06 | 4.900763
100rfu dataset
Combined | 2 | 10 | -4358.63 | 41.68688 | 456.51 | 285.56 | 6.850117
 | 4 | 10 | -5164.93 | 1265.555 | -977.25 | 6582.16 | 5.201008
 | 3 | 10 | -4187.68 | 282.7365 | 170.95 | 1148.2 | 4.061024
Standard | 3 | 5 | -4135.28 | 115.5629 | 222.28 | 1528.22 | 13.22414
 | 4 | 5 | -5441.22 | 1378.755 | -1305.94 | 8373.18 | 6.073002
 | 2 | 5 | -4357.56 | 54.888 | 458.74 | 236.46 | 4.308045
LOCPRIOR | 2 | 5 | -4359.7 | 29.90794 | 454.28 | 334.66 | 11.18967
 | 4 | 5 | -4888.64 | 1229.577 | -648.56 | 4791.14 | 3.896577
 | 7 | 5 | -12121.9 | 4413.491 | 429.48 | 17051.6 | 3.863518
200rfu dataset
Combined | 2 | 20 | -2527.3 | 98.08819 | 188.115 | 491.245 | 5.008197
 | 9 | 20 | -6596.87 | 3560.598 | 1204.67 | 6382.91 | 1.792651
 | 8 | 20 | -7801.54 | 4452.608 | -1272.98 | 2477.645 | 0.556448
Standard | 2 | 10 | -2508.19 | 45.95141 | 207.62 | 232.48 | 5.059257
 | 3 | 10 | -2533.05 | 197.4806 | -24.86 | 363.12 | 1.838763
 | 7 | 10 | -4307.97 | 1899.158 | 218.04 | 2850.52 | 1.500939
LOCPRIOR | 2 | 10 | -2508.19 | 45.95141 | 207.62 | 232.48 | 5.059257
 | 3 | 10 | -2533.05 | 197.4806 | -24.86 | 363.12 | 1.838763
 | 7 | 10 | -4307.97 | 1899.158 | 218.04 | 2850.52 | 1.500939
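The columns of Table 7 are linked by the Evanno calculation: Delta K is the absolute second-order change in mean LnP(K), divided by the standard deviation of LnP(K) across replicate runs. The sketch below reproduces this for K = 3 in the 50rfu Combined model. The mean LnP(K) for K = 2 is not listed in the table and is back-calculated here from Ln'(3) = 703.47, so it should be read as an assumption; the function name is illustrative and this is not STRUCTURE Harvester's code.

```python
def evanno_delta_k(mean_lnp, sd_lnp):
    """Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)) for each interior K.

    mean_lnp and sd_lnp map K to the mean and standard deviation of LnP(K)
    across replicate runs. K values without both neighbours are skipped.
    """
    delta = {}
    for k in sorted(mean_lnp):
        has_neighbours = (k - 1) in mean_lnp and (k + 1) in mean_lnp
        if has_neighbours and sd_lnp.get(k, 0) > 0:
            second_order = abs(mean_lnp[k + 1] - 2 * mean_lnp[k] + mean_lnp[k - 1])
            delta[k] = second_order / sd_lnp[k]
    return delta


# 50rfu Combined model, Table 7. The K = 2 mean is derived as
# -10301.7 - 703.47 (mean LnP(3) minus Ln'(3)), not read from the table.
mean_lnp = {2: -11005.17, 3: -10301.7, 4: -29771.5}
sd_lnp = {3: 18.22118}

delta_k = evanno_delta_k(mean_lnp, sd_lnp)
print(delta_k)  # Delta K for K = 3 is close to the 1107.129 reported in Table 7
```

A very small standard deviation at a given K inflates Delta K, so the mean LnP(K) values are worth inspecting alongside Delta K when choosing K.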