4.1. Diversity statistics and allele frequency
In the present study, the effects of four cycles of recurrent selection (RS) on simple sequence repeat (SSR) marker allele frequencies and population structure were examined in the Maksimir 3 Synthetic (M3S) maize population. The variability of the C0 population, in terms of mean number of alleles per locus and expected heterozygosity, was of the same order of magnitude as that of the starting populations of other RS programs [
5,
7,
9,
14,
19].
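The two diversity measures referred to above can be illustrated with a minimal sketch. It assumes diploid genotypes stored as allele pairs and uses the standard definition of expected heterozygosity (one minus the sum of squared allele frequencies), with an optional small-sample correction. The function names and the toy genotypes are hypothetical and are not data from the M3S population.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele frequencies at one locus from diploid genotypes (allele pairs)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def expected_heterozygosity(freqs, n_individuals=None):
    """He = 1 - sum(p_i^2); if n_individuals is given, apply the
    2n / (2n - 1) small-sample correction."""
    he = 1.0 - sum(p * p for p in freqs.values())
    if n_individuals is not None:
        two_n = 2 * n_individuals
        he *= two_n / (two_n - 1)
    return he

def diversity_summary(loci):
    """Mean number of alleles per locus and mean (uncorrected) He."""
    n_alleles, he_values = [], []
    for genotypes in loci.values():
        freqs = allele_frequencies(genotypes)
        n_alleles.append(len(freqs))
        he_values.append(expected_heterozygosity(freqs))
    k = len(loci)
    return sum(n_alleles) / k, sum(he_values) / k

# Hypothetical toy data: two loci, four plants each (not from the study).
toy_loci = {
    "locus_A": [("a1", "a2"), ("a1", "a1"), ("a2", "a3"), ("a1", "a2")],
    "locus_B": [("b1", "b1"), ("b1", "b2"), ("b1", "b1"), ("b1", "b1")],
}
mean_alleles, mean_he = diversity_summary(toy_loci)
print(mean_alleles, mean_he)
```

Comparing these two summaries between C0 and a later cycle population is the kind of contrast discussed in this section; whether a difference is significant requires a separate test that is not sketched here.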
The mean number of alleles per locus and mean expected heterozygosity did not change significantly in the M3S population after four cycles of RS. Similarly, in the study of Wisser et al. [
40], neither of the two diversity measures, revealed by SSR markers, changed significantly after four cycles of RS for quantitative disease resistance in a complex maize population from CIMMYT. Kolawole et al. [
17] and Wisser et al. [
40] reported that the changes in different measures of SNP diversity were either small or negligible in maize populations under recurrent selection. In the study by Daas et al. [
41], the genetic diversity of the two maize populations likewise did not change significantly after two cycles of genomic selection. In contrast, Labate et al. [
18] and Hinze et al. [
19] observed significant decreases in the mean number of alleles per locus and expected heterozygosity in the BSSS and BSCB1 populations after 12 and 15 cycles of reciprocal RS, respectively. A decrease in marker diversity (in terms of number of polymorphic loci, mean number of alleles per locus, expected heterozygosity, or observed heterozygosity) within maize populations that underwent various numbers of RS cycles has also been reported in several previous studies [
7,
14,
20,
21,
22,
42,
43,
44,
45,
46,
47].
In the present study, some alleles found in the base population (C0) were absent from subsequent cycle populations, while some alleles absent from the base population were detected in one or more subsequent cycle populations. Most of the missing alleles had been present at low frequencies (less than 0.10), as in the study of McLean-Rodríguez et al. [
48], where most of the alleles lost or gained over time in 13 Mexican landraces were rare or at low frequency. Such rare alleles may have gone undetected in a particular cycle population in the present study because of the relatively small sample size (32 plants genotyped per cycle population), which was also the case in the study by McLean-Rodríguez et al. [
48], who sampled only 10 plants per population. Nevertheless, our sample size is similar to the sample size of 30 plants used for SSR analysis of the BSSS and BSCB1 populations studied by Hinze et al. [
19]. In an earlier study, Labate et al. [
18] genotyped 100 individuals of the same two maize populations using RFLP markers and reported higher estimates of the average number of alleles per locus, expected heterozygosity, heterozygous plants, and number of unique alleles, which reflects not only the differences between the two types of markers used in the two studies (RFLPs versus SSRs), but also the power of larger samples in detecting less frequent alleles [
19]. Although the sample size of 32 genotyped plants per population in the present study was relatively small, it is comparable to sample sizes reported in some previous studies in maize using SSR markers [
17,
18] as well as SNP markers [
19,
20,
48], which ranged from 10 to 36 individuals per population. However, a possible role of pollen or seed contamination during the development of the M3S cycle populations cannot be ruled out either.
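The sample-size argument above can be made concrete with a short calculation. The sketch below assumes that the 2n gene copies in a sample of n diploid plants are drawn independently from the population, which is a simplification (it ignores departures from random mating). Under that assumption, an allele at frequency p is missed entirely with probability (1 - p)^(2n). The function name is hypothetical.

```python
def prob_allele_missed(freq, n_plants):
    """Probability that an allele at population frequency `freq` is absent
    from a sample of `n_plants` diploid plants, assuming the 2n sampled
    gene copies are independent draws."""
    return (1.0 - freq) ** (2 * n_plants)

# Sample sizes mentioned in the text: 10, 32, and 100 plants per population.
for n in (10, 32, 100):
    print(n, prob_allele_missed(0.05, n))
```

For an allele at a frequency of 0.05, this gives a probability of being missed of roughly 0.36 with 10 plants and roughly 0.04 with 32 plants, which is in line with the point that larger samples are better at detecting less frequent alleles.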
Compared with C0, the mean allele frequency in the improved M3S cycle populations did not change, although slightly higher proportions of alleles at low and at high frequencies were found. Similar changes in the distribution of allele frequencies after various numbers of RS cycles in maize were reported earlier by Labate et al. [
4], Pinto et al. [
5], and Šarčević et al. [
6], whereas Kolawole et al. [
17] reported the opposite: a decrease in the proportion of alleles at both low and high frequencies and an increase in the proportion at intermediate frequencies.
According to the Waples test, changes in allele frequencies in M3S, both from cycle to cycle and after four cycles of selection, were mainly attributable to the effects of random genetic drift. Similar results were reported in previous studies examining changes in allele frequencies in maize populations undergoing RS [
2,
4,
5,
43]. Assuming the value of Ne=N, the Waples test identified six (16%) and three (8%) nonneutral loci after four cycles of selection from C0 to C4N0 and C4N150, respectively (
Table 3). In single cycles, the number of nonneutral loci ranged from six (16%), between C2 and C3S1, to 14 (37%), between C2 and C3FS. Labate et al. [
4] observed 17% nonneutral loci in BSSS(R) and BSCB1(R) populations after 12 cycles of reciprocal RS. Pinto et al. [
5] reported significant allele frequency changes due to selection at 13% and 7% of SSR loci in two tropical maize synthetics that underwent a single cycle of high intensity reciprocal RS. Falke et al. [
2] detected 20.13% nonneutral loci in one and 12.87% in a second biparental maize population after four and seven cycles of intrapopulation RS, respectively. In these studies, loci with significant allele frequency changes due to selection were not confined to particular chromosomes or genomic regions but were dispersed over the whole genome. In our study, selectively nonneutral loci were likewise found on all ten maize chromosomes, but their number and chromosomal position varied among cycles of selection. The occurrence of nonneutral loci was inconsistent among the four cycles of selection, and discrepancies were also observed between the neutrality status of loci after four cycles of selection and their neutrality status across individual cycles of selection (
Table S2). Even in cases where a particular marker locus was recognized as nonneutral over several cycles of selection in the M3S population, there was inconsistency in the neutrality status of different alleles at these loci. In addition to selfed progeny RS (used through all four cycles of selection), FS RS was implemented in the third cycle and, in addition to yield, other traits such as disease resistance in the first and N use efficiency in the fourth selection cycle were considered. These factors might have contributed to selection pressure on different QTLs during the four selection cycles in the M3S population. According to Wisser et al. [
13] the most important drawback to selection mapping of an individual trait arises if selection is exerted for multiple traits, which is typically the case in breeding populations used for production. In such cases, selection mapping will not distinguish loci responding to a particular selection pressure.
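The logic of the neutrality test discussed above can be sketched as follows. The idea is to compare the observed change in allele frequency with the variance expected from genetic drift plus sampling alone. The sketch uses a common simplified approximation of that variance for a single allele, Var ≈ P(1 - P)[1/(2S0) + 1/(2St) + t/(2Ne)], and a chi-square statistic with one degree of freedom. This is an assumption for illustration, not the exact formulation used in the present study, which is described in the Materials and Methods. The function name and all example values (frequencies, Ne, number of generations) are hypothetical.

```python
import math

def drift_adjusted_chi2(p0, pt, s0, st, ne, t):
    """Chi-square test (1 df) of an allele frequency change against drift.

    p0, pt : allele frequency in the earlier and the later sample
    s0, st : number of diploid individuals sampled at each time point
    ne     : assumed effective population size
    t      : number of generations between the two samples
    Uses the simplified variance P(1-P)[1/(2*s0) + 1/(2*st) + t/(2*ne)].
    """
    p_bar = (p0 + pt) / 2.0
    if not 0.0 < p_bar < 1.0:
        raise ValueError("mean allele frequency must lie strictly between 0 and 1")
    variance = p_bar * (1.0 - p_bar) * (
        1.0 / (2 * s0) + 1.0 / (2 * st) + t / (2.0 * ne)
    )
    chi2 = (p0 - pt) ** 2 / variance
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical example: 32 plants per sample, assumed Ne = 20, t = 4.
chi2, p_value = drift_adjusted_chi2(0.30, 0.60, 32, 32, 20, 4)
print(chi2, p_value)
```

A locus is flagged as nonneutral only when the frequency change is larger than drift and sampling can plausibly explain, so a smaller assumed Ne (more drift) makes the test more conservative. For multi-allelic SSR loci the per-allele contributions would have to be combined, which this sketch does not attempt.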
It has also been shown that the effect of QTLs on trait values can vary among environments [
49,
50], thus causing significant QTL × environment interaction (QEI). Because the selection of progenies for recombination in different cycles of RS in the M3S population was based on data collected in different sets of environments, QEI may have contributed to the inconsistency in the results of neutrality tests between individual cycles of selection. Besides QEI, another reason for this inconsistency could be that more than two alleles (up to seven) were found in the population at these marker loci. Thus, it can be assumed that more than one marker allele per locus was initially (in the base population) linked to favorable as well as unfavorable alleles at a particular QTL, leading to random changes of frequencies within the two groups of marker alleles as a result of selection pressure at that QTL.
The C1 cycle population of M3S was created by intermating the highest yielding S2 progenies after stringent selection for disease resistance among the preceding S1 progenies. The applied two-stage selection method resulted in a population with the highest mean ΦST value (Table 4) between a single cycle population and all other cycle populations (mean ΦST = 0.067). Selection for two generally negatively correlated traits may have increased the genetic distance of C1 from the other cycle populations, which were developed through selection for yield only. The differentiation of C1 observed in the molecular data was also evident at the phenotypic level, as reported by Bukan et al. [
30] (decreased yield of C1 at both N fertilization levels investigated). Yield decreases after primary selection for pest resistance were also reported by Devey and Russell [
51] and Klenke et al. [
52]. Butrón et al. [
43] found a significant linear trend for departure from the random genetic drift model for some alleles of two SSR markers, umc1329 and phi076, in their study of molecular changes in a maize composite during selection for resistance to pink stem borer. In the C1 population of M3S, phi076 was also among the significantly nonneutral SSR markers. In the third cycle of selection, a large difference in the number of nonneutral markers was observed between the FS and S1 methods of selection (14 from C2 to C3FS versus six from C2 to C3S1). The higher number of nonneutral markers found for the FS method is in accordance with the higher yield and disease resistance observed for C3FS in comparison to C3S1 [
29]. The pairwise ΦST values between C3S1 and C3FS (Table 4) showed divergence of the two populations from each other, confirming different effects of the two methods of selection applied in the third cycle. In the fourth cycle of selection, we observed a higher number of nonneutral SSR loci in C4N150 than in C4N0 (nine from C3FS to C4N150 versus seven from C3FS to C4N0). Coque and Gallais [
12] also found more SSR loci to be under selection in high N fertility environments. The same authors found that the two genomic regions responding to selection were common to both high N and low N conditions, which, according to them, corroborates the observation of Bertin and Gallais [
53] that grain yield QTLs detected in low N conditions were very often a subset of QTLs detected in high N conditions, but probably differentially expressed. Three SSR markers used by Coque and Gallais [
12] were located in genomic regions found to be associated with grain yield, N uptake, and kernel number under both high and low N conditions (bnlg1643); grain yield and kernel weight under low N conditions (umc1653); and N utilization efficiency under both high and low N conditions (bnlg1402). Those three SSR markers share the same bin location (1.08, 6.07 and 9.02, respectively) as the three selectively nonneutral SSR markers (dupssr12, phi123 and umc1033) in the fourth cycle of selection of the present study, which was conducted under contrasting N fertilization regimes. The C4N0 cycle population, besides exhibiting possible adaptation to low N conditions, also exhibited a significant reduction of anthesis-silking interval (ASI) in comparison to earlier cycle populations [
30]. Two SSR loci which were selectively nonneutral in C4N0 (dupssr12 and phi438301) had the same bin location (1.08 and 4.05, respectively) as RFLP and SSR markers previously found to be associated with QTL affecting ASI in diverse sets of environments [
54]. Recent studies [
55,
56] also reported that significant SNP bases and QTLs for ASI-delay due to drought or high-density stress were located on chromosomes 1 and 4.
4.3. Linkage disequilibrium (LD)
The LD test revealed that the M3S population, across cycles of selection, was essentially in linkage equilibrium, with the number of significant LD tests ranging from only one (0.14%) in C3S1 to 13 (1.85%) in C1. We assumed that the LD at the three pairs of loci found in the base population (C0) either originated from parental LD or was created during population maintenance by chain sib-mating. All but one of the instances of LD observed in the improved cycle populations (from C1 to C4) were not found in the C0 population and must have been generated over the course of the RS program. The total number of pairs of markers in LD generally increased with selection, which is consistent with the results reported for other populations improved through RS [
7,
14,
17,
43]. Theoretically, LD in a population can arise from intermixture of populations with different allele frequencies, by chance in small populations (random genetic drift), from selection favoring one combination of alleles over another (epistatic selection), or assortative mating [
3,
58]. On the other hand, hitchhiking may lead to an increase but also a decrease of LD between two neutral loci linked to a locus experiencing positive directional selection, depending on the position of that locus relative to two neutral loci [
59]. In several previous studies, the LD generated during the course of recurrent selection in maize synthetic populations was suggested to result mainly from genetic drift [
60], from natural selection for epistatic effects [
57], or from selection for epistatic effects [
7,
9,
14,
57]. All of the above-mentioned evolutionary forces could also be involved in creating LD between loci in the M3S population. In a single cycle of selection, genetic drift is expected to generate new LD between loci regardless of whether they are linked or unlinked. According to Equation 1 (given in the Materials and Methods section), drift-related LD generated in any single cycle of selection should predominantly involve unlinked loci, because the probability that two randomly selected loci are linked is only 0.08. However, because the rate of LD decay is positively correlated with the recombination rate between the two loci [
3], we can assume that the rate of LD decay in the M3S population due to intercrossing of selected progenies and seed multiplication of cycle populations was lower for linked than for unlinked loci. This can possibly explain the observed surplus of LD pairs including linked loci in the present study. Selection for favorable epistatic interactions may also have been involved in generating LD in the M3S population, given the observed overrepresentation of nonneutral pairs among pairs of loci detected to be in significant LD (based on Equation 2 given in the Materials and Methods section).
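The proportions of significant LD tests quoted in this section can be checked with simple arithmetic. The sketch below assumes 38 SSR loci, a number inferred here from the reported percentages (for example, six loci corresponding to 16%) rather than stated in this section. It also illustrates why most randomly chosen locus pairs are unlinked, using a hypothetical, roughly even spread of loci over ten chromosomes and treating "linked" as "on the same chromosome". Neither assumption reproduces Equation 1 of the Materials and Methods.

```python
from math import comb

N_LOCI = 38  # assumption: inferred from the reported percentages
N_PAIRS = comb(N_LOCI, 2)  # number of pairwise LD tests

def percent_of_pairs(n_significant, n_pairs=N_PAIRS):
    """Percentage of locus pairs with a significant LD test."""
    return 100.0 * n_significant / n_pairs

def same_chromosome_fraction(loci_per_chromosome):
    """Fraction of all locus pairs whose two loci lie on the same chromosome."""
    total = sum(loci_per_chromosome)
    within = sum(comb(n, 2) for n in loci_per_chromosome)
    return within / comb(total, 2)

# Hypothetical marker distribution: 8 chromosomes with 4 loci, 2 with 3 loci.
hypothetical_counts = [4] * 8 + [3] * 2

print(N_PAIRS)
print(percent_of_pairs(1), percent_of_pairs(13))
print(same_chromosome_fraction(hypothetical_counts))
```

Under these assumptions, one and 13 significant pairs correspond to about 0.14% and 1.85% of all tests, matching the values reported for C3S1 and C1. The same-chromosome fraction for the hypothetical distribution is below 0.1, which illustrates that drift acting on random pairs should mostly create LD between unlinked loci; this rough agreement with the quoted 0.08 should not be read as a reconstruction of the study's calculation.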