4.1. Markers associated with feed quality traits
Buffel grass is an important forage grass in the tropical and subtropical regions of the world [
24,
25]. Substantial variation in agronomic and nutritional quality traits was observed in the buffel grass accessions which shows the rich genetic variation embedded in the collection from which to select lines with superior performance. Genotypes × season interaction was also significant for all traits indicating that multiyear evaluation of buffel grass is essential to determine consistent performance of the genotypes. However, the maximum biomass yield recorded in the current report is less than the biomass yield reported elsewhere [
25,
28,
33]. The relatively lower biomass yield observed at Bishoftu could be related to the environmental conditions and difference in management practices. On the other hand, the range of CP content of the studied accessions was wider than what has been reported elsewhere for the grass [
28,
50]. Based on genome-wide DArTSeq markers, the collection was clustered into eight clusters [
9]. The accessions in clusters I, II and III showed low biomass yield but a relatively higher feed quality (CP, TDN and DMI) than the rest of the clusters. Cluster IV had the highest biomass yield and the tallest plants compared to the other clusters. Similarly, Jorge and colleagues [
35] also studied 68 accessions and classified them based on the robustness of the plant, flowering characters and growth forms. Accordingly, some of the accessions with the highest biomass yield and tallest plants belong to the most robust and leafiness cluster group while accessions with the lowest biomass yield belong to the cluster with short leaves and thin stems.
The present study also revealed different levels of variability and heritability (H
2) among genotypes. Biomass yield and plant height recorded the highest value for PCV and GCV indicating the presence of high genetic variability for the traits. The PCV value for biomass yield was equivalent to the corresponding GCV value while the PCV value for plant height was close to GCV value. The heritability estimates were high for both biomass yield and plant height. This shows the substantial contribution of genetic factors to the observed performance for both traits. Thus, directional selection might be effective to improve these two traits. On the other hand, NDF, TDN, CP, ADF and DMI showed low PCV, GCV and H
2 estimates indicating low genetic variability. PCV values for feed quality traits were greater than the corresponding GCV values, indicating the significant effect of environmental factors on the expression of these traits. This is in line with the observed low heritability estimates for the feed quality traits. In general, the evaluated accessions showed significant variation in performance. Hence, given the observed genetic and phenotypic performance variation in the collection [
9,
35,
51], there is a potential improvement opportunity in the buffel grass germplasm to develop high yielding climate resilient varieties.
4.3. Marker trait associations in buffel grass
In the present study, we used a buffelgrass germplasm collection in the forage genebank at ILRI and conducted GWAS to identify marker trait associations. Several marker trait associations were identified. Using the combined data from the two seasons, three SNP and eight SilicoDArT markers were found to be associated with biomass yield. A total of nine markers (six SilicoDArT and three SNP) were found to be associated with biomass yield in the 2016 growing season. One of the SNP and two of the SilicoDArT markers were detected both using the 2016 season and the combined data. In the 2015 growing season, one SilicoDArT marker was associated with biomass yield. One SilicoDArT marker was associated with plant height in 2015. No marker was associated with plant height using the 2016 season and the combined data.
Using the combined data, four SilicoDArT markers were found to be associated with ADF and TDN, while no marker was found to be associated with the other feed quality traits (CP, NDF and DMI). One of the four markers was also detected using individual season data and it was associated with CP in 2015 and TDN in 2016. A total of eight SilicoDArT markers (four in 2015 associated with CP, two in 2016 associated with CP, three in 2016 associated with TDN) were found associated with feed quality traits.
Using the combined data, a total 42 SNP markers were associated with feed quality traits, of which four were also detected using the 2016 season data. Of the four markers detected using both combined and 2016 season data, two were associated with the same trait (one with ADF and the other with TDN). The other two markers were associated with different traits depending on the dataset. Seven SNP markers were associated with DMI using the combined data while no marker was found to be associated with the trait using the individual season data. Thirteen SNP markers were found to be associated with feed quality using data from the 2016 growing season. Of these markers, four were associated with TDN, two with CP and seven with ADF. The different marker trait associations identified between the two growing seasons (2015 and 2016) could be related to the difference in weather conditions (
Supplementary Table S11). For example, the average monthly rainfall of the location during the months of July to September was 117 mm in 2015 and 142 mm in 2016. In addition, in 2015, the minimum and maximum daily temperature during July to August was 12°C and 26°C while it was 13°C and 28°C, respectively, during the same months in 2016. The variation in growing conditions would affect the performance of the genotypes and result in variation in the marker trait associations for the different years. Another reason could be that the plants were well established during the second season and therefore more able to reach a performance towards the crops genetic potential.
4.4. Genome wide distribution and co-localization of the marker trait associations
Except for a few studies with conventional molecular markers[
52], genomic studies are limited in buffel grass. A reference genome has not been developed to date. The lack of its own reference genome has hindered mapping and selection of genome wide representative markers for further molecular studies. As a result, the reference genome of
Setaria italica [
39] was used to map the generated markers. However, only a small percentage of the total markers were successfully mapped [
9]. Despite this challenge, we conducted a GWAS using the mapped markers and identified several marker trait associations with R
2 values ranging from 0.138 to 0.236. The identified marker trait associations were distributed across the different chromosomes of the
Setaria italica genome (
Figure 8). On Chr1, three SilicoDArT markers (one for CP and two for biomass yield) and four SNPs (one associated with CP and TDN, and one each for biomass yield, CP and ADF) were identified. The SNP associated with ADF was detected using the 2016 season while the SNPs with biomass yield, CP and TDN were identified using the combined data. Three SilicoDArT markers (one for CP and three for biomass yield using the 2016 season data) and two SNP markers (one each for CP and NDF using combined data) were identified on Chr2. Five SNP markers (two associated with biomass yield, one with TDN, one with both NDF and TDN and one with both CP and TDN) and two SilicoDArT markers (one each for yield and TDN) were located on Chr3. On Chr4, one SilicoDArT marker associated with biomass yield and one SNP marker associated with multiple traits (ADF, TDN and DMI) were identified using the combined data. No marker on this chromosome was found to be associated with these traits using individual season data.
On Chr5, 13 markers were associated with different traits. One SilicoDArT marker was associated with both CP (2015 season data) and TDN (2016 season data) while it was associated with ADF and TDN using the combined data. Another two SilicoDArT markers associated with CP (2016 season) and TDN (combined data) were also located on this chromosome. In addition to the SilicoDArT markers, ten SNP markers associated with different traits (one with biomass yield, three with TDN, three with DMI, three with CP, two with ADF and two with NDF) were also found on this chromosome. Four of these SNP markers were associated with two different traits.
On Chr6, there were two SilicoDArT markers (one each associated with CP and biomass yield) and six SNP markers (two with TDN, two with ADF, one with both TDN and ADF, and one with TDN and DMI) associated with different traits. One of the SNP markers was associated with three feed quality traits (NDF, TDN and DMI) while one was associated with both ADF and TDN. Six SNP and two SilicoDArT markers associated with different traits were located on Chr7. One of these markers was associated with three feed quality traits (CP, ADF and TDN) while the other three markers were associated with two different traits. Three SilicoDArT markers (one each for plant height, biomass yield, and TDN) and one SNP marker associated with CP were located on Chr8. A total of 23 markers associated with traits (18 SNP and 5 SilicoDArT) were located on Chr9. Of these markers, nine were associated with CP, five with ADF, two with NDF, one with biomass yield and five with TDN. Among the SilicoDArT markers, one is associated with both biomass yield and TDN, two with CP and three with biomass yield. One of the SNP markers was associated with three feed quality traits (CP, ADF and TDN) while four SNP markers were associated with two of the traits.
In a few cases, a single marker was associated with two and three traits or markers associated with two different traits were closely located on the same chromosome. Markers associated with three traits were found on Chr4, Chr5, Chr6, Chr7 and Chr9. The markers on Chr4 and Chr6 were associated with ADF, TDN and DMI while the markers on Chr5, Chr7 and Chr9 were associated CP, ADF and TDN. In addition, several markers associated with two traits were also found on Chr1, Chr3, Chr5, Chr6, Chr7 and Chr9. For example, a SilicoDArT marker on Chr5 was associated with both CP and TDN while another SilicoDArT marker on Chr9 was associated with both biomass yield and TDN. Closely located marker trait associations were also found on six of the nine chromosomes. On Chr1, a SilicoDArT marker associated with CP and a SNP marker associated with biomass yield are located at 398,585 bp from each other. Similarly, on Chr8, markers associated with plant height and biomass yield have a physical distance of 565,288 bp from each other. Among the markers on Chr9, a SNP marker associated with CP and a SilicoDArT marker associated with biomass yield are located at 501,486 bp from each other. Other closely located marker trait associations are also found on Chr1 (biomass yield and CP/TDN), Chr3 (ADF/TDN and NDF), Chr5 (TDN and NDF/DMI), Chr6 (NDF and ADF/TDN), Chr7 (CP and CP/ADF/TDN), and Chr9 (biomass yield and CP, CP and ADF, NDF and CP/TDN). In summary, a total of 78 marker-trait associations (one based on both individual growing season and combined data, 47 based on combined data only, 21 based on individual growing season data only and 9 based on both combined and 2016 growing season data) were identified in this study. The generated information on the genome distribution of the marker trait associations will be a useful resource for future improvement programs in this important tropical forage. Furthermore, an additional study is required to validate the associations and co-localization of the identified markers. In line with this suggestion, it is very important to develop a buffel grass reference genome to facilitate genomic studies and the development of markers for efficient marker-assisted selection/breeding. Developing a species specific reference genome will increase the number of mappable markers and thereby improve the discovery and accuracy of the marker trait associations in this drought tolerant tropical forage.
4.5. Marker trait association in functional putative genomic regions
Some of the identified marker trait associations were in genomic regions related to key enzymes and proteins involved in different biochemical reactions and processes in plants. Among the identified SNP markers associated with biomass yield, one is located on Chr1 in the genomic region linked to a gene encoding a Phenylalanine ammonia-lyase (PAL)-like protein. PAL catalyzes the deamination of phenylalanine to produce trans-cinnamic acid, a precursor of lignins, flavanoids, and coumarins and it is a key enzyme that induces the synthesis of salicylic acid that causes systemic resistance in many plants [
53,
54]. A recent study has shown that PAL-knockdown plants in the model grass
Brachypodium distachyon have exhibited delayed development, reduced root growth as well as increased susceptibility to diseases [
55]. Another marker associated with biomass yield is located on Chr3 in the region related to a gene encoding a U-box domain-containing protein 1. This protein is in the family of ubiquitin ligase (E3) enzymes that are involved in various biological processes and in stress response in plants[
56]. Similarly, the SilicoDArT marker associated with plant height is located on Chr8 in the genomic region harboring a gene annotated as a
Setaria italica ankyrin-1 protein. This protein family is conserved in plants and involved in biochemical processes in response to biotic and abiotic stresses [
57,
58,
59].
Several markers were found to be associated with feed quality traits. These markers were distributed over the different chromosomes of the
Setaria italica genome. Some of the marker trait associations are located in the genomic regions linked to different biophysiological processes in plants. One of the marker trait (CP) associations on Chr2 is close to a gene encoding a E3 ubiquitin-protein ligase RGLG1-like in
Setaria viridis. E3 ubiquitin-protein ligase is a family of proteins that catalyse the ubiquitination of protein substrates for targeted degradation[
60] and have been known as an important regulator of drought stress response in plants [
61]. A SilicoDArT marker associated with TDN (on Chr8) is close to a gene encoding a predicted
Setaria italica chlorophyll a-b binding protein CP26, chloroplastic. This protein is conserved in plants and green algae and plays a key role in absorbing and transferring sunlight energy into chemical energy [
62]. Both E3 ubiquitin-protein ligases and chlorophyll a-b binding proteins are involved in many other biophysiological processes that contribute to plant growth and development.
Among the SNP markers associated with feed quality traits, the marker associated with TDN (on Chr6) is close to genes encoding a tryptophan decarboxylase 1-like and aromatic-L-amino-acid decarboxylase in grass which are involved in many biochemical reactions contributing to the formation of many metabolites involved in biotic and abiotic stress defence in plants [
63,
64]. A marker associated with CP (on Chr9) is located in the genomic region containing a gene encoding a pentatricopeptide repeat (PPR)-containing protein. PPR proteins are one of the largest nuclear-encoded protein families in higher plants and interact with RNA to affect gene expressions necessary for organelle development [
65]. On Chr9, another SNP marker associated with ADF is found in the genomic region harbouring a gene encoding a Detoxification 40-like protein, which is believed to play a role in response to stresses in plants (e.g., detoxification of a heavy metal Cd(2+) in rice) [
66]. In general, marker trait associations in genomic regions containing genes linked to important enzymes and proteins were identified. This result could be used as a starting point for a further study to elucidate genomic regions with genes controlling important traits such as drought tolerance, disease resistance and feed quality traits.