Abstract
Major international projects are underway that are aimed at creating a comprehensive catalogue of all the genes responsible for the initiation and progression of cancer (refs 1–9). These studies involve the sequencing of matched tumour–normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false-positive findings that overshadow true driver events. We show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumour–normal pairs and discover extraordinary variation in mutation frequency and spectrum within cancer types, which sheds light on mutational processes and disease aetiology, and in mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and enable the identification of genes truly associated with cancer.
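The false-positive effect described in the abstract can be illustrated with a toy calculation. The sketch below is not the MutSigCV algorithm; it assumes a simple Poisson model of mutation counts, and the gene names, mutation rates, gene lengths and sample counts are all hypothetical values chosen for illustration. It compares the significance of a gene when tested against a single cohort-wide background rate with its significance when tested against a gene-specific background rate.

```python
import math


def poisson_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    Sums the upper tail directly, starting from P(X = observed) computed in
    log space. Adequate for the modest expected counts used in this toy example.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed <= 0:
        return 1.0
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)  # P(X = observed)
    total = 0.0
    i = observed
    while True:
        total += term
        i += 1
        term *= expected / i  # P(X = i) from P(X = i - 1)
        # Stop once past the mode and the remaining terms are negligible.
        if i > expected and term <= total * 1e-16:
            break
    return min(1.0, total)


def gene_p_value(n_mutations: int, length_bp: int, n_samples: int, rate_per_bp: float) -> float:
    """P-value for seeing at least n_mutations given a background rate."""
    expected = rate_per_bp * length_bp * n_samples
    return poisson_tail(n_mutations, expected)


# Hypothetical cohort-wide average: mutations per base pair per sample.
GLOBAL_RATE = 2e-6

# Hypothetical genes. The first is long and sits in a region with a threefold
# higher local background rate; the second is short with an average local rate.
GENES = {
    "LONG_LATE_REPLICATING": {"n_mutations": 60, "length_bp": 100_000, "local_rate": 6e-6},
    "SHORT_DRIVER_LIKE": {"n_mutations": 15, "length_bp": 1_500, "local_rate": 2e-6},
}
N_SAMPLES = 100

results = {}
for name, g in GENES.items():
    results[name] = {
        "global": gene_p_value(g["n_mutations"], g["length_bp"], N_SAMPLES, GLOBAL_RATE),
        "local": gene_p_value(g["n_mutations"], g["length_bp"], N_SAMPLES, g["local_rate"]),
    }
    print(f"{name}: p(global rate) = {results[name]['global']:.3g}, "
          f"p(local rate) = {results[name]['local']:.3g}")
```

Under these assumed numbers, the long gene looks highly significant against the cohort-wide rate but unremarkable once its own higher background rate is used, whereas the short gene remains significant under both models. This is the sense in which ignoring heterogeneity in background mutation rate can inflate the list of apparently significant genes.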
References
The Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061–1068 (2008)
The Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615 (2011)
The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012)
Ding, L. et al. Somatic mutations affect key pathways in lung adenocarcinoma. Nature 455, 1069–1075 (2008)
Stransky, N. et al. The mutational landscape of head and neck squamous cell carcinoma. Science 333, 1157–1160 (2011)
Chapman, M. A. et al. Initial genome sequencing and analysis of multiple myeloma. Nature 471, 467–472 (2011)
Wang, L. et al. SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N. Engl. J. Med. 365, 2497–2506 (2011)
Morin, R. D. et al. Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476, 298–303 (2011)
Lohr, J. G. et al. Discovery and prioritization of somatic mutations in diffuse large B-cell lymphoma (DLBCL) by whole-exome sequencing. Proc. Natl Acad. Sci. USA 109, 3879–3884 (2012)
The Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 489, 519–525 (2012)
Shibata, T. et al. Cancer related mutations in NRF2 impair its recognition by Keap1-Cul3 E3 ligase and promote malignancy. Proc. Natl Acad. Sci. USA 105, 13568–13573 (2008)
Stephens, P. J. et al. The landscape of cancer genes and mutational processes in breast cancer. Nature 486, 400–404 (2012)
Berger, M. F. et al. Melanoma genome sequencing reveals frequent PREX2 mutations. Nature 485, 502–506 (2012)
Parsons, D. W. et al. An integrated genomic analysis of human glioblastoma multiforme. Science 321, 1807–1812 (2008)
Greenman, C. et al. Patterns of somatic mutation in human cancer genomes. Nature 446, 153–158 (2007)
Kan, Z. et al. Diverse somatic mutation patterns and pathway alterations in human cancers. Nature 466, 869–873 (2010)
Pleasance, E. D. et al. A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463, 184–190 (2010)
Pleasance, E. D. et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463, 191–196 (2010)
Nik-Zainal, S. et al. Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012)
Roberts, S. A. et al. Clustered mutations in yeast and in human cancers can arise from damaged long single-strand DNA regions. Mol. Cell 46, 424–435 (2012)
Vartanian, J. P., Guetard, D., Henry, M. & Wain-Hobson, S. Evidence for editing of human papillomavirus DNA by APOBEC3 in benign and precancerous lesions. Science 320, 230–233 (2008)
Walboomers, J. M. et al. Human papillomavirus is a necessary cause of invasive cervical cancer worldwide. J. Pathol. 189, 12–19 (1999)
Jimenez-Pacheco, A., Exposito-Ruiz, M., Arrabal-Polo, M. A. & Lopez-Luque, A. J. Meta-analysis of studies analyzing the role of human papillomavirus in the development of bladder carcinoma. Korean J. Urol. 53, 240–247 (2012)
Hodgkinson, A. & Eyre-Walker, A. Variation in the mutation rate across mammalian genomes. Nature Rev. Genet. 12, 756–766 (2011)
Fousteri, M. & Mullenders, L. H. Transcription-coupled nucleotide excision repair in mammalian cells: molecular mechanisms and biological effects. Cell Res. 18, 73–84 (2008)
Stamatoyannopoulos, J. A. et al. Human mutation rate associated with DNA replication timing. Nature Genet. 41, 393–395 (2009)
Chen, C. L. et al. Impact of replication timing on non-CpG and CpG substitution rates in mammalian genomes. Genome Res. 20, 447–457 (2010)
Koren, A. et al. Differential relationship of DNA replication timing to different forms of human mutation and variation. Am. J. Hum. Genet. 91, 1033–1040 (2012)
Landau, D. A. et al. Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell 152, 714–726 (2013)
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nature Biotechnol. 31, 213–219 (2013)
Acknowledgements
This work was conducted as part of TCGA, a project of the National Cancer Institute and National Human Genome Research Institute. This work was conducted as part of the Slim Initiative for Genomic Medicine, a joint US–Mexico project founded by the Carlos Slim Health Institute. Support to D.A.G. and S.A.R. was through the Intramural Research Program of the National Institute of Environmental Health Sciences (National Institutes of Health, United States Department of Health and Human Services) project ES065073 (principal investigator M. Resnick).
Author information
Contributions
G.G., E.S.L., S.S., D.A.G., T.R.G., M.M., L.A.G., A.J.B., K.S., J.A.B., C.W.M.R., S.B.G., C.J.W., S.A.M., J.M.-Z. and A.H.-M. conceived the project and provided leadership. C.So., L.A., E.N., E.S., M.L.C., D.A., W.W. and K.A. provided project management. W.W., K.A., T.F., R.O. and M.P. planned and carried out DNA sequencing and genetic analysis. T.F., D.V., G.S., M.N., D.D., P.L., L.L. and D.I.H. developed and engineered software to support the project. M.S.L., P.S., P.P., G.V.K., K.C., A.S., S.L.C., C.St., C.H.M., S.A.R., A.Ki., P.S.H., A.M., Y.D., L.Z., A.H.R., T.J.P., N.S., E.H., J.K., M.I., B.H., E.H., S.B., A.M.D., J.L., D.-A.L., C.J.W., J.M.-Z., A.H.-M., A.Ko., S.A.M., R.S.L., J.M., B.C., A.J.B. and D.A.G. analysed the data and contributed to scientific discussions. M.S.L., P.S., P.P., E.S.L. and G.G. wrote the paper.
Ethics declarations
Competing interests
A patent application has been filed relating to this work.
Supplementary information
Supplementary Information
This file contains Supplementary Figures 1–11, Supplementary Methods 0–3, legends for Supplementary Tables 1–7 (see separate Excel file for Supplementary Tables) and additional references. (PDF 1974 kb)
Supplementary Tables
This file contains Supplementary Tables 1-7. (XLS 9400 kb)
About this article
Cite this article
Lawrence, M., Stojanov, P., Polak, P. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013). https://doi.org/10.1038/nature12213
DOI: https://doi.org/10.1038/nature12213