Nat Genet. 2018 Jun;50(6):895-903.
doi: 10.1038/s41588-018-0128-6. Epub 2018 May 28.

Quantification of subclonal selection in cancer from bulk sequencing data

Marc J Williams et al. Nat Genet. 2018 Jun.

Erratum in

Abstract

Subclonal architectures are prevalent across cancer types. However, the temporal evolutionary dynamics that produce tumor subclones remain unknown. Here we measure clone dynamics in human cancers by using computational modeling of subclonal selection and theoretical population genetics applied to high-throughput sequencing data. Our method determined the detectable subclonal architecture of tumor samples and simultaneously measured the selective advantage and time of appearance of each subclone. We demonstrate the accuracy of our approach and the extent to which evolutionary dynamics are recorded in the genome. Application of our method to high-depth sequencing data from breast, gastric, blood, colon and lung cancer samples, as well as metastatic deposits, showed that detectable subclones under selection, when present, consistently emerged early during tumor growth and had a large fitness advantage (>20%). Our quantitative framework provides new insight into the evolutionary trajectories of human cancers and facilitates predictive measurements in individual tumors from widely available sequencing data.

Figures

Figure 1. Modelling patterns of subclonal selection in sequencing data.
(a) In a stochastic branching process model of tumour growth, cells have birth rate b and death rate d, and mutations accumulate at rate μ. Cells with a fitness advantage (orange) grow at a faster net rate (b-d) than the host population (blue). (b) The variant allele frequency (VAF) distribution contains clonal (truncal) mutations around f=0.5 (in this example of a diploid tumour) and subclonal mutations (f<0.5), which encode how a tumour has grown. In the absence of subclonal selection, a neutral 1/f^2 tail describes the accumulation of passenger mutations as the tumour expands. (c) A selected subclone produces an additional peak in the distribution, while a 1/f^2 tail is still present due to passenger mutations accumulating in both the original population and the new subclone. (d) In the presence of subclonal selection, the magnitude and average frequency of the subclonal cluster of mutations (red) encode the age and size of a subclone respectively, which in turn allows the clone's selective advantage to be measured. (e) Frequentist power analysis of the detectability of an emerging selected subclone on simulated data. Only early and/or very fit subclones caused significant alterations of the clonal composition of a tumour, resulting in the rejection of the neutral (null) model. Tumours were simulated to 10^6 cells and scaled to a final population size of 10^10 with a mutation rate of 20 mutations per genome per division; each pixel represents the average value of the metric (area between curves) over 50 simulations.
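The neutral 1/f^2 tail described in panel (b) can be illustrated with a small sketch. It is not the authors' simulator: it simply assumes a VAF density proportional to 1/f^2 on a hypothetical window [f_min, f_max], draws frequencies by inverse-transform sampling, and compares the cumulative count of mutations at or above a frequency f with the value expected when that count is linear in 1/f. All function names and default values are illustrative assumptions.

```python
import random

def sample_neutral_vafs(n, f_min=0.05, f_max=0.25, seed=0):
    """Draw n VAFs from a density proportional to 1/f^2 on [f_min, f_max].

    Under this density 1/f is uniformly distributed, so we sample 1/f
    uniformly between 1/f_max and 1/f_min and invert it.
    """
    rng = random.Random(seed)
    inv_lo, inv_hi = 1.0 / f_min, 1.0 / f_max
    return [1.0 / (inv_lo - rng.random() * (inv_lo - inv_hi)) for _ in range(n)]

def cumulative_count(vafs, f):
    """Number of mutations with frequency at or above f."""
    return sum(1 for v in vafs if v >= f)

def expected_count(n, f, f_min, f_max):
    """Expected cumulative count under the 1/f^2 density: linear in 1/f."""
    return n * (1.0 / f - 1.0 / f_max) / (1.0 / f_min - 1.0 / f_max)

# Illustrative check on a hypothetical sample of 20,000 mutations.
vafs = sample_neutral_vafs(20000)
for f in (0.08, 0.10, 0.15, 0.20):
    print(f, cumulative_count(vafs, f), round(expected_count(20000, f, 0.05, 0.25)))
```

A selected subclone, as in panel (c), would show up as a cluster of mutations that makes the observed cumulative count depart from this straight-line behaviour in 1/f.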
Figure 2. Accurate recovery of evolutionary parameters from simulated data using Approximate Bayesian Computation.
Our method recovered the correct clonal structure in simulated tumour data for representative examples of (a) a neutral case, (b) a one-subclone case and (c) a two-subclone case. Grey bars are simulated VAF data; solid red lines indicate the median histograms from the simulations that were selected by the statistical inference framework (500 posterior samples); shaded areas are 95% intervals. The inferred posterior distributions of the evolutionary parameters contained the true values (dashed lines) for (d,f) the time of emergence of the subclones and (e,g) the selection coefficient 1+s. (h) The mean percentage error in inferred parameter values across a virtual tumour cohort (n=100 tumours) was below 10%. Boxplots show the median and interquartile range (IQR); the upper whisker is the 3rd quartile + 1.5*IQR and the lower whisker is the 1st quartile - 1.5*IQR.
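The general idea of Approximate Bayesian Computation by rejection can be sketched as follows. This is a toy illustration, not the authors' inference framework: the simulator (binomial read sampling around a single subclone frequency), the uniform prior, the summary statistic (mean VAF) and the tolerance `eps` are all assumptions chosen to keep the example short.

```python
import random
import statistics

def simulate_vafs(f, n_mut, depth, rng):
    """Toy simulator (assumption): each of n_mut mutations has true
    frequency f and is observed through binomial sampling of `depth` reads."""
    return [sum(rng.random() < f for _ in range(depth)) / depth
            for _ in range(n_mut)]

def abc_rejection(observed, prior_draws, eps, depth, rng):
    """Keep prior draws whose simulated mean VAF is within eps of the data."""
    target = sum(observed) / len(observed)
    accepted = []
    for _ in range(prior_draws):
        f = rng.uniform(0.0, 0.5)  # hypothetical uniform prior on frequency
        sim = simulate_vafs(f, len(observed), depth, rng)
        if abs(sum(sim) / len(sim) - target) < eps:
            accepted.append(f)
    return accepted

rng = random.Random(1)
# Synthetic "observed" data generated with a known frequency of 0.2.
observed = simulate_vafs(0.2, n_mut=30, depth=60, rng=rng)
posterior = abc_rejection(observed, prior_draws=2000, eps=0.01, depth=60, rng=rng)
posterior_median = statistics.median(posterior)
print(len(posterior), round(posterior_median, 3))
```

The accepted draws play the role of posterior samples, so a median and a 95% interval over them correspond to the solid lines and shaded areas described in the caption.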
Figure 3. Quantifying selection from high-depth bulk sequencing of human cancers.
Both (a) an acute myeloid leukemia (AML) sample and (b) a breast cancer sample sequenced at whole-genome resolution showed evidence of two selected subclones. (c) In the case of a multi-region whole-exome sequenced case of lung cancer, one sample showed evidence of a single subclone whereas four other samples (d-g) from the same patient were consistent with the neutral model. Grey bars are the data; solid red lines indicate the median histograms from the simulations that were selected by the statistical inference framework (500 posterior samples); shaded areas are the 95% intervals. (h) Bayesian model selection reports the expected clonal structure for each case (Bayes Factors reported above histograms). (i) Inferred subclone fitness advantages were 20% and 80% faster than the original population. (j) Inferred times of subclone emergence indicated subclones arose within the first 15 tumour population doublings. (k) Inferred mutation rates were of the order of 10^-7 mutations per base per tumour doubling in solid tumours but ~10^-9 in AML, reflecting the respective differences in mutational burden between cancer types. All posterior distributions were generated from 500 samples.
Figure 4. Quantifying selection in large cohorts of primary tumours and metastatic lesions.
(a) 21% of colon cancers (N=70) from TCGA (sequenced to sufficient depth and with high enough cellularity for statistical inference), 29% of WGS gastric cancers (N=17) (data from ref., filtered for cellularity) and 53% of metastases (N=113) from sites had evidence of differentially selected subclones. When present, differentially selected subclones were found to (b) have large fitness advantages with respect to the host population and (c) emerge early during growth. Bayes Factors for subclonal structures for all data are reported in Supplementary Table 4. Posterior distributions were generated from 500 samples. Boxplots show the median and interquartile range (IQR); the upper whisker is the 3rd quartile + 1.5*IQR and the lower whisker is the 1st quartile - 1.5*IQR.
Figure 5. Predicting the future evolution of subclones.
(a) The VAF distribution of an in silico tumour sampled at 10^5 cells was used to measure the fitness and time of emergence of a subclone. Grey bars are the simulated data; solid red lines indicate the median histograms from the simulations that were selected by the statistical inference framework (500 posterior samples); shaded areas are the 95% intervals. Inset shows error from ground truth. (b) These values were then used to predict the spread of the subclone as the tumour grew to 10^7 cells; the predictions matched the ground truth. Predictions were made by extrapolating the posterior distribution of 1+s using equations in the main text. Solid line shows the median value from the posterior distribution; shaded area shows the 95% interval. (c) Using the same approach in the AML sample, where we measured 1+s, t_1 and t_2, we would predict that subclone 2 would become dominant within 3-4 further tumour doublings while subclone 1 would become too small to be detected.
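The kind of extrapolation described in panel (b) can be sketched under a deliberately simple assumption, which is not taken from the equations in the main text: the host population doubles once per unit of time, and a subclone founded by a single cell at time t1 (measured in host doublings) grows (1+s) times faster. The function name and parameter values below are hypothetical.

```python
def subclone_fraction(t, t1, s):
    """Fraction of tumour cells in the subclone at time t (in host doublings).

    Assumes deterministic exponential growth: host size 2**t, subclone size
    2**((1 + s) * (t - t1)) once it has emerged at time t1.
    """
    if t < t1:
        return 0.0
    host = 2.0 ** t
    sub = 2.0 ** ((1.0 + s) * (t - t1))
    return sub / (host + sub)

# Illustrative trajectory for a subclone emerging at doubling 5 with s = 1.
for t in (5, 8, 10, 12, 15):
    print(t, round(subclone_fraction(t, t1=5, s=1.0), 4))
```

Applying such a function to each posterior sample of 1+s and the emergence time, then taking the median and 95% interval at each time point, would give a predicted trajectory with an uncertainty band of the kind the caption describes.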
