Review
Cells. 2021 Nov 30;10(12):3372.
doi: 10.3390/cells10123372.

Strategies to Increase Prediction Accuracy in Genomic Selection of Complex Traits in Alfalfa (Medicago sativa L.)

Cesar A Medina et al. Cells.

Abstract

Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous results indicated that prediction accuracies for complex traits, such as yield and abiotic stress resistance, were low to moderate (<70%). Prediction accuracy needs to increase before GS can be employed in breeding programs. In this paper, we review different statistical models and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical data on alfalfa yield under salt stress to investigate weighted GBLUP (WGBLUP) analyses in which the marker weights are either DNA marker importance values derived from machine learning models or marker-trait association scores from genome-wide association studies (GWAS) based on different GWASpoly models. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. Finally, we extended the weighted GBLUP approach to potato, analyzed 13 phenotypic traits, and obtained similar results. This is the first report on alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.
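
To make the weighted GBLUP idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes allele dosages coded 0 to 4 for an autotetraploid, a VanRaden-style additive relationship matrix scaled by ploidy, per-marker weights rescaled to mean 1, and a fixed shrinkage parameter `lam` (the ratio of residual to genetic variance) that would normally be estimated by REML. The function names `weighted_g_matrix` and `gblup_predict`, and the simulated data, are hypothetical.

```python
import numpy as np

def weighted_g_matrix(M, weights, ploidy=4):
    """Weighted VanRaden-style genomic relationship matrix (one common convention).

    M       : (n_individuals, n_markers) allele dosages in 0..ploidy
    weights : non-negative per-marker weights, e.g. variable importance
              values or -log10 p-values from a GWAS
    """
    M = np.asarray(M, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w * (len(w) / w.sum())          # rescale so the mean weight is 1
    p = M.mean(axis=0) / ploidy         # allele frequency per marker
    Z = M - ploidy * p                  # center each marker column
    denom = ploidy * np.sum(w * p * (1.0 - p))
    return (Z * w) @ Z.T / denom        # Z diag(w) Z' / scaling

def gblup_predict(G, y, train_idx, test_idx, lam=1.0):
    """Predict test phenotypes from training individuals via GBLUP.

    lam is an assumed, fixed variance ratio (sigma_e^2 / sigma_g^2).
    """
    y = np.asarray(y, dtype=float)
    mu = y[train_idx].mean()
    G_tt = G[np.ix_(train_idx, train_idx)]   # train x train relationships
    G_st = G[np.ix_(test_idx, train_idx)]    # test x train relationships
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train_idx)),
                            y[train_idx] - mu)
    return mu + G_st @ alpha

# Usage on simulated data (all values below are illustrative only).
rng = np.random.default_rng(0)
dosages = rng.integers(0, 5, size=(60, 200))        # 60 plants, 200 markers
marker_weights = rng.random(200) + 0.1              # stand-in for importance scores
phenotype = rng.normal(size=60)
G_w = weighted_g_matrix(dosages, marker_weights)
predicted = gblup_predict(G_w, phenotype, np.arange(50), np.arange(50, 60))
```

With all weights equal, `weighted_g_matrix` reduces to the unweighted matrix, so the same code covers both GBLUP and WGBLUP in this sketch.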

Keywords: Medicago sativa; WGBLUP; genomic selection.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Indirect selection based on molecular markers. (a) Generalized Manhattan plots illustrating a comparison of GWAS effectiveness in simple (left) vs. complex traits (right). Note: Bold dashed line indicates minimum threshold to select significant markers. A significant signal (i.e., QTL) was identified in the simple trait (left panel), while no defined QTL was identified for the complex trait. Therefore, genomic selection (GS) is more appropriate and practical for complex traits. (b) Common parametric and non-parametric models used in GS and their computational requirements. GBLUP, genomic best linear unbiased prediction; RRBLUP, ridge-regression BLUP; RF, random forest; SVM, support vector machine; MLP, multilayer perceptron; CNN, convolutional neural network; RNN, recurrent neural network.
Figure 2
Optimization of GS models. (a) GS model accuracy measured as Pearson’s correlation after 10-fold cross-validation for biomass yield under salt stress. Computing time was measured as system time in seconds to run one cross-validation. (b) Example of variable importance values derived from SVM for 10 randomly chosen SNPs. (c) Pearson’s correlation between the weights of 6796 SNPs obtained by variable importance (SVM, RF) or by −log10 p-values of different GWASpoly models. (d) Accuracy of GBLUP (GBLUP VR and GBLUP FA) and WGBLUP models. Accuracy was measured 10 times using Pearson’s correlation with 10-fold cross-validation. SNP weights for WGBLUP were obtained from variable importance values (SVM, RF) or −log10 p-values of different GWASpoly models. RRBLUP, ridge-regression best linear unbiased prediction; BL, Bayes LASSO; GBLUP, genomic best linear unbiased prediction; VR, VanRaden G matrix; FA, full autotetraploid G matrix; RF, random forest; SVM, support vector machine; WGBLUP, weighted GBLUP; 1-dom-alt and 1-dom-ref, simplex dominant models; 2-dom-alt and 2-dom-ref, duplex dominant models; diplo-general, diploidized general; diplo-additive, diploidized additive.
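
The accuracy measure named in the Figure 2 caption (Pearson's correlation under 10-fold cross-validation, repeated 10 times) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the paper: a plain ridge regression on centered markers stands in for the prediction model, the correlation is computed within each fold and then averaged (the paper may pool predictions instead), and the names `ridge_fit_predict` and `cv_accuracy`, the penalty `lam`, and the simulated data are all hypothetical.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression on centered markers, solved in its dual (n x n) form."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    K = Xc @ Xc.T                                   # n_train x n_train kernel
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - y_mean)
    beta = Xc.T @ alpha                             # marker effects
    return y_mean + (X_test - x_mean) @ beta

def cv_accuracy(X, y, k=10, seed=0, lam=1.0):
    """Mean within-fold Pearson correlation over one k-fold cross-validation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = ridge_fit_predict(X[train], y[train], X[test], lam)
        accuracies.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(accuracies))

# Usage on simulated data: repeat the 10-fold cross-validation 10 times,
# each time with a different random partition of the individuals.
rng = np.random.default_rng(42)
X_sim = rng.integers(0, 5, size=(200, 20)).astype(float)   # tetraploid dosages
effects = rng.normal(size=20)
y_sim = X_sim @ effects + rng.normal(scale=0.1, size=200)
repeats = [cv_accuracy(X_sim, y_sim, k=10, seed=s) for s in range(10)]
```

Any other model (GBLUP, WGBLUP, SVM, RF) could be substituted for `ridge_fit_predict` as long as it is fitted on the training folds only.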
