REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants

Am J Hum Genet. 2016 Oct 6;99(4):877-885.

doi: 10.1016/j.ajhg.2016.08.016. Epub 2016 Sep 22.

Nilah M Ioannidis¹, Joseph H Rothstein², Vikas Pejaver³, Sumit Middha⁴, Shannon K McDonnell⁵, Saurabh Baheti⁵, Anthony Musolf⁶, Qing Li⁶, Emily Holzinger⁶, Danielle Karyadi⁷, Lisa A Cannon-Albright⁸, Craig C Teerlink⁸, Janet L Stanford⁹, William B Isaacs¹⁰, Jianfeng Xu¹¹, Kathleen A Cooney¹², Ethan M Lange¹³, Johanna Schleutker¹⁴, John D Carpten¹⁵, Isaac J Powell¹⁶, Olivier Cussenot¹⁷, Geraldine Cancel-Tassin¹⁷, Graham G Giles¹⁸, Robert J MacInnis¹⁸, Christiane Maier¹⁹, Chih-Lin Hsieh²⁰, Fredrik Wiklund²¹, William J Catalona²², William D Foulkes²³, Diptasri Mandal²⁴, Rosalind A Eeles²⁵, Zsofia Kote-Jarai²⁵, Carlos D Bustamante²⁶, Daniel J Schaid⁵, Trevor Hastie²⁷, Elaine A Ostrander⁷, Joan E Bailey-Wilson⁶, Predrag Radivojac³, Stephen N Thibodeau²⁸, Alice S Whittemore²⁹, Weiva Sieh³⁰

Affiliations

¹ Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Health Research and Policy, Stanford University, Stanford, CA 94305, USA.
² Department of Health Research and Policy, Stanford University, Stanford, CA 94305, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
³ Department of Computer Science and Informatics, Indiana University, Bloomington, IN 47405, USA.
⁴ Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
⁵ Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA.
⁶ Computational and Statistical Genomics Branch, National Human Genome Research Institute, Baltimore, MD 21224, USA.
⁷ Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, Bethesda, MD 20892, USA.
⁸ Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT 84108, USA.
⁹ Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
¹⁰ Brady Urological Institute, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA.
¹¹ NorthShore University HealthSystem Research Institute, Evanston, IL 60201, USA.
¹² Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT 84108, USA; Departments of Internal Medicine and Urology, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
¹³ Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
¹⁴ Department of Medical Biochemistry and Genetics, University of Turku, Turku 20014, Finland; Department of Medical Genetics, Tyks Microbiology and Genetics, Turku University Hospital, Turku 20520, Finland.
¹⁵ Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, AZ 85004, USA.
¹⁶ Department of Urology, Wayne State University, Detroit, MI 48201, USA.
¹⁷ Centre de Recherche sur les Pathologies Prostatiques et Urologiques, Universite Paris, Paris, 75013, France.
¹⁸ Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, VIC 3004, Australia; Centre for Epidemiology and Biostatistics, University of Melbourne, Melbourne, VIC 3010, Australia.
¹⁹ Institute of Human Genetics, University Hospital of Ulm, Ulm 89075, Germany; Department of Urology, University Hospital of Ulm, Ulm 89075, Germany.
²⁰ Department of Urology, University of Southern California, Los Angeles, CA 90033, USA.
²¹ Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 171 77, Sweden.
²² Department of Urology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA.
²³ Departments of Oncology and Human Genetics, Montreal General Hospital, Montreal, QC H3G 1A4, Canada.
²⁴ Department of Genetics, Louisiana State University Health Sciences Center, New Orleans, LA 70112, USA.
²⁵ Division of Genetics and Epidemiology, Institute of Cancer Research, London SM2 5NG, UK.
²⁶ Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA.
²⁷ Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Statistics, Stanford University, Stanford, CA 94305, USA.
²⁸ Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA.
²⁹ Department of Health Research and Policy, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA.
³⁰ Department of Health Research and Policy, Stanford University, Stanford, CA 94305, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA. Electronic address: weiva.sieh@mssm.edu.

PMID: 27666373
PMCID: PMC5065685
DOI: 10.1016/j.ajhg.2016.08.016

Nilah M Ioannidis et al. Am J Hum Genet. 2016.

Abstract

The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons. REVEL was trained with recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. When applied to two independent test sets, REVEL had the best overall performance (p < 10^-12) as compared to any individual tool and seven ensemble methods: MetaSVM, MetaLR, KGGSeq, Condel, CADD, DANN, and Eigen. Importantly, REVEL also had the best performance for distinguishing pathogenic from rare neutral variants with allele frequencies <0.5%. The area under the receiver operating characteristic curve (AUC) for REVEL was 0.046-0.182 higher in an independent test set of 935 recent SwissVar disease variants and 123,935 putatively neutral exome sequencing variants and 0.027-0.143 higher in an independent test set of 1,953 pathogenic and 2,406 benign variants recently reported in ClinVar than the AUCs for other ensemble methods. We provide pre-computed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants in the sea of rare variants discovered as sequencing studies expand in scale.
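
The AUC comparisons reported in the abstract can be illustrated with a small sketch. The code below computes the AUC as the probability that a randomly chosen pathogenic variant scores higher than a randomly chosen neutral one (the Mann-Whitney formulation), then takes the difference between two predictors. The function name `auc`, the toy labels, and all scores are hypothetical and are not taken from the paper; this is not the authors' evaluation code.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (pathogenic, neutral) pairs ranked correctly.

    labels: 1 = pathogenic, 0 = neutral. Tied pairs count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one neutral variant")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 pathogenic and 4 neutral variants.
labels = [1, 1, 1, 0, 0, 0, 0]
ensemble_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]     # made-up ensemble output
single_tool_scores = [0.7, 0.3, 0.4, 0.6, 0.5, 0.2, 0.1]  # made-up single-tool output

auc_ensemble = auc(ensemble_scores, labels)
auc_single = auc(single_tool_scores, labels)
print(f"ensemble AUC:    {auc_ensemble:.3f}")
print(f"single-tool AUC: {auc_single:.3f}")
print(f"difference:      {auc_ensemble - auc_single:.3f}")
```

The pairwise loop is quadratic in the number of variants, which is fine for a toy example; for test sets of the size described above, a rank-based implementation would be the more practical choice.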

Figures

**Figure 1**
Individual Prediction Tools Included as Features in the REVEL Random Forest. (A) Correlation among the individual features, ordered by hierarchical clustering. The heatmap illustrates the Spearman rank correlation coefficients between features computed for the REVEL training variants. (B) Relative importance of individual features. Gini importance estimates were normalized to sum to one.
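
The two quantities named in this caption can be sketched in a few lines. The code below computes a Spearman rank correlation (Pearson correlation of average ranks) between two feature columns and rescales a set of importance values so that they sum to one. The feature names `tool_a`, `tool_b`, `tool_c` and every number are hypothetical placeholders, not values from the paper.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / (sx * sy)


def normalize(importances):
    """Rescale importance values so they sum to one."""
    total = sum(importances.values())
    return {name: value / total for name, value in importances.items()}


# Hypothetical scores from two tools on the same five variants.
tool_a = [0.1, 0.4, 0.35, 0.8, 0.9]
tool_b = [1.0, 3.0, 2.0, 5.0, 4.0]
rho = spearman(tool_a, tool_b)

# Hypothetical raw importance estimates.
normalized = normalize({"tool_a": 2.0, "tool_b": 1.0, "tool_c": 1.0})
print(rho, normalized)
```

Because Spearman correlation depends only on ranks, tools that report scores on different scales can be compared directly without rescaling.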

**Figure 2**
Performance of Ensemble Methods for Discrimination of Disease Training Variants from Putatively Neutral ESVs. (A) ROC curves for 6,182 HGMD disease mutations and 123,706 rare (allele frequency [AF] 0.001–0.01) neutral exome sequencing variants (ESVs) used to train REVEL. REVEL scores were computed with only the out-of-bag (OOB) predictions for its training variants. (B) AUC for 6,182 HGMD disease mutations and 140,921 neutral ESVs, including REVEL training variants, stratified by neutral variant AF.

**Figure 3**
Performance of Ensemble Methods in an Independent Test Set of SwissVar Disease Mutations and Putatively Neutral ESVs. (A) ROC curves for 935 SwissVar disease mutations and 123,935 rare (AF 0.001–0.01) neutral ESVs that did not overlap with the training set. (B) AUC for 935 SwissVar disease mutations and 141,051 neutral ESVs, excluding REVEL training variants, stratified by neutral variant AF.
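
One way to read "stratified by neutral variant AF" in panel (B) is that the full set of disease variants is compared against only those neutral variants whose allele frequency falls in a given bin, and one AUC is reported per bin. The sketch below follows that reading. The bin edges in `AF_BINS`, the function names, and all scores and frequencies are hypothetical and are not the values used in the paper.

```python
def auc(pos, neg):
    """Fraction of (disease, neutral) score pairs ranked correctly; ties count half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_auc(disease_scores, neutral, bins):
    """AUC of all disease variants against the neutral variants in each AF bin.

    neutral: list of (score, allele_frequency) pairs.
    bins: list of (low, high) pairs; low is inclusive, high is exclusive.
    Bins that contain no neutral variants are skipped.
    """
    result = {}
    for low, high in bins:
        in_bin = [score for score, af in neutral if low <= af < high]
        if in_bin:
            result[(low, high)] = auc(disease_scores, in_bin)
    return result


# Hypothetical bin edges, chosen only for illustration.
AF_BINS = [(0.0, 0.001), (0.001, 0.01), (0.01, 1.0)]

# Hypothetical predictor scores.
disease_scores = [0.9, 0.7, 0.6]
neutral = [
    (0.5, 0.0005), (0.8, 0.0008),  # very rare neutral variants
    (0.3, 0.005), (0.2, 0.004),    # rare neutral variants
    (0.1, 0.05), (0.65, 0.2),      # more common neutral variants
]

by_bin = stratified_auc(disease_scores, neutral, AF_BINS)
for (low, high), value in by_bin.items():
    print(f"AF in [{low}, {high}): AUC = {value:.3f}")
```

Reporting the AUC per bin makes it visible whether a method separates disease variants from the rarest neutral variants as well as it does from more common ones.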

**Figure 4**
Performance of Ensemble Methods in an Independent Test Set of 1,953 Pathogenic and 2,406 Benign Variants from ClinVar. (A) ROC curves and the AUC for all variants. (B) AUC for each ensemble method, stratified by neutral variant AF.

**Figure 5**
Interpretation of REVEL Scores. (A) Distribution of REVEL scores for 6,182 disease (red) and 123,706 neutral (blue) training variants and 1,125,160 ESVs (black). (B) Percentiles of the REVEL score distribution for the same three sets of variants. In both panels, REVEL scores for training variants were computed with only the OOB predictions.
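
Score distributions and percentiles like those in this figure are typically used to pick a working threshold. The sketch below shows the two basic calculations: the fraction of a variant set scoring above a threshold, and the empirical percentile of a single score within a reference set. The threshold of 0.5, the function names, and all scores are arbitrary illustrations and are not results from the paper.

```python
from bisect import bisect_right


def fraction_above(scores, threshold):
    """Fraction of variants whose score is strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


def percentile_of(score, sorted_reference):
    """Percentage of reference scores that are less than or equal to `score`.

    `sorted_reference` must already be sorted in ascending order.
    """
    return 100.0 * bisect_right(sorted_reference, score) / len(sorted_reference)


# Hypothetical scores for a handful of disease and neutral variants.
disease = [0.95, 0.8, 0.75, 0.4]
neutral = [0.05, 0.1, 0.2, 0.3, 0.6]

threshold = 0.5  # arbitrary cutoff for illustration only
sensitivity = fraction_above(disease, threshold)          # disease variants flagged
false_positive_rate = fraction_above(neutral, threshold)  # neutral variants flagged
pct = percentile_of(0.3, sorted(neutral))

print(f"disease variants above {threshold}: {sensitivity:.2f}")
print(f"neutral variants above {threshold}: {false_positive_rate:.2f}")
print(f"percentile of score 0.3 among neutral variants: {pct:.1f}")
```

Raising the threshold lowers both fractions, so the choice depends on how many false positives a given study can tolerate.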

Cited by

Effect of an autism-associated KCNMB2 variant, G124R, on BK channel properties.
Moldenhauer HJ, Dinsdale RL, Alvarez S, Fernández-Jaén A, Meredith AL. Curr Res Physiol. 2022 Sep 25;5:404-413. doi: 10.1016/j.crphys.2022.09.001. eCollection 2022. PMID: 36203817.
Assessing performance of pathogenicity predictors using clinically relevant variant datasets.
Gunning AC, Fryer V, Fasham J, Crosby AH, Ellard S, Baple EL, Wright CF. J Med Genet. 2021 Aug;58(8):547-555. doi: 10.1136/jmedgenet-2020-107003. Epub 2020 Aug 25. PMID: 32843488.
Germline Variants in DNA Damage Repair Genes and HOXB13 Among Black Patients With Early-Onset Prostate Cancer.
Trendowski MR, Sample C, Baird T, Sadeghpour A, Moon D, Ruterbusch JJ, Beebe-Dimmer JL, Cooney KA. JCO Precis Oncol. 2022 Nov;6:e2200460. doi: 10.1200/PO.22.00460. PMID: 36446039.
Germline variant analysis from a cohort of patients with severe hypertriglyceridemia in Brazil.
Mendes C, Loureiro T, Villela D, Bittencourt MI, Sobreira J, Bermeo D, Gomes M, Alencar D, de Castro LSS, Fock RA, Tinoco ML, Galvão H, Scapulatempo-Neto C, Schiavetti K, Senerchia AA, Gurgel MHC. Mol Genet Metab Rep. 2024 Jun 7;40:101100. doi: 10.1016/j.ymgmr.2024.101100. eCollection 2024 Sep. PMID: 38933898.
Characterization of CRB1 splicing in retinal organoids derived from a patient with adult-onset rod-cone dystrophy caused by the c.1892A>G and c.2548G>A variants.
Zhang X, Thompson JA, Zhang D, Charng J, Arunachalam S, McLaren TL, Lamey TM, De Roach JN, Jennings L, McLenachan S, Chen FK. Mol Genet Genomic Med. 2020 Nov;8(11):e1489. doi: 10.1002/mgg3.1489. Epub 2020 Sep 15. PMID: 32931148.
