Am J Hum Genet. 2016 Oct 6;99(4):877-885.
doi: 10.1016/j.ajhg.2016.08.016. Epub 2016 Sep 22.

REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants

Nilah M Ioannidis et al. Am J Hum Genet.

Abstract

The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons. REVEL was trained with recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. When applied to two independent test sets, REVEL had the best overall performance (p < 10⁻¹²) as compared to any individual tool and seven ensemble methods: MetaSVM, MetaLR, KGGSeq, Condel, CADD, DANN, and Eigen. Importantly, REVEL also had the best performance for distinguishing pathogenic from rare neutral variants with allele frequencies <0.5%. Compared with the other ensemble methods, REVEL's area under the receiver operating characteristic curve (AUC) was 0.046–0.182 higher in an independent test set of 935 recent SwissVar disease variants and 123,935 putatively neutral exome sequencing variants, and 0.027–0.143 higher in an independent test set of 1,953 pathogenic and 2,406 benign variants recently reported in ClinVar. We provide pre-computed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants in the sea of rare variants discovered as sequencing studies expand in scale.
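
As a rough illustration of the AUC metric reported above, the following Python sketch computes it as the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen neutral variant, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical and are not taken from the REVEL study; the pairwise loop is chosen for clarity and would be too slow for test sets of the size described in the abstract.

```python
def roc_auc(pathogenic_scores, neutral_scores):
    """AUC as the fraction of (pathogenic, neutral) pairs ranked correctly.

    A pair is ranked correctly when the pathogenic variant has the higher
    score; tied pairs contribute one half.
    """
    if not pathogenic_scores or not neutral_scores:
        raise ValueError("both classes need at least one score")
    correct = 0.0
    for p in pathogenic_scores:
        for n in neutral_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pathogenic_scores) * len(neutral_scores))


# Hypothetical scores in [0, 1]; higher means "more likely pathogenic".
example_pathogenic = [0.9, 0.8, 0.4]
example_neutral = [0.7, 0.3, 0.2, 0.1]
example_auc = roc_auc(example_pathogenic, example_neutral)  # 11 of 12 pairs
```

An AUC difference such as the 0.046–0.182 range quoted above would then be the difference between two such values computed on the same test set with two different scoring methods.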

Figures

Figure 1
Individual Prediction Tools Included as Features in the REVEL Random Forest. (A) Correlation among the individual features, ordered by hierarchical clustering. The heatmap illustrates the Spearman rank correlation coefficients between features computed for the REVEL training variants. (B) Relative importance of individual features. Gini importance estimates were normalized to sum to one.
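
The two quantities named in this caption can be sketched in plain Python: a Spearman rank correlation (the Pearson correlation of average ranks) and a normalization of importance estimates so that they sum to one. This is only an illustrative sketch; the function names and the feature names `toolA` and `toolB` are hypothetical, and the raw importance values are invented rather than taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def normalize_importances(raw):
    """Scale non-negative importance estimates so they sum to one."""
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical raw importances for two made-up features.
normalized = normalize_importances({"toolA": 2.0, "toolB": 6.0})
```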
Figure 2
Performance of Ensemble Methods for Discrimination of Disease Training Variants from Putatively Neutral ESVs. (A) ROC curves for 6,182 HGMD disease mutations and 123,706 rare (AF 0.001–0.01) neutral ESVs used to train REVEL. REVEL scores were computed with only the out-of-bag (OOB) predictions for its training variants. (B) AUC for 6,182 HGMD disease mutations and 140,921 neutral ESVs, including REVEL training variants, stratified by neutral variant AF.
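
The idea behind OOB scoring of training variants can be sketched as follows: each base learner is fit on a bootstrap sample, and a training example is scored only by the learners whose bootstrap sample did not contain it. The sketch below is a toy under stated assumptions, not the REVEL implementation: it swaps the random forest's decision trees for a one-feature threshold "stump", assumes higher feature values indicate pathogenicity, and uses hypothetical names (`fit_stump`, `oob_scores`) and made-up data.

```python
import random


def fit_stump(xs, ys):
    """Toy base learner: threshold midway between the two class means.

    Returns None when the sample contains only one class.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    if not pos or not neg:
        return None
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def oob_scores(xs, ys, n_learners=200, seed=0):
    """Average, per training example, the votes of learners that never saw it."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [0.0] * n
    counts = [0] * n
    for _ in range(n_learners):
        # Bootstrap sample: n draws with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        threshold = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        if threshold is None:
            continue  # skip degenerate single-class samples
        for i in range(n):
            if i not in in_bag:  # example i is out-of-bag for this learner
                votes[i] += 1.0 if xs[i] > threshold else 0.0
                counts[i] += 1
    # None marks an example that was never out-of-bag.
    return [votes[i] / counts[i] if counts[i] else None for i in range(n)]


# Made-up, well-separated data: label 1 = disease, 0 = neutral.
toy_x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
toy_y = [0, 0, 0, 1, 1, 1]
toy_scores = oob_scores(toy_x, toy_y)
```

Scoring training variants this way avoids evaluating a learner on examples it was fit to, which is why the caption notes that only OOB predictions were used for the training variants.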
Figure 3
Performance of Ensemble Methods in an Independent Test Set of SwissVar Disease Mutations and Putatively Neutral ESVs. (A) ROC curves for 935 SwissVar disease mutations and 123,935 rare (AF 0.001–0.01) neutral ESVs that did not overlap with the training set. (B) AUC for 935 SwissVar disease mutations and 141,051 neutral ESVs, excluding REVEL training variants, stratified by neutral variant AF.
Figure 4
Performance of Ensemble Methods in an Independent Test Set of 1,953 Pathogenic and 2,406 Benign Variants from ClinVar. (A) ROC curves and the AUC for all variants. (B) AUC for each ensemble method, stratified by neutral variant AF.
Figure 5
Interpretation of REVEL Scores. (A) Distribution of REVEL scores for 6,182 disease (red) and 123,706 neutral (blue) training variants and 1,125,160 ESVs (black). REVEL scores were computed with only the OOB predictions for training variants. (B) Percentiles of the REVEL score distribution for 6,182 disease (red) and 123,706 neutral (blue) training variants and 1,125,160 ESVs (black). REVEL scores were computed with only the OOB predictions for training variants.
