  • Letter

Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq

Abstract

The mammalian lung is a highly branched network in which the distal regions of the bronchial tree transform during development into a densely packed honeycomb of alveolar air sacs that mediate gas exchange. Although this transformation has been studied by marker expression analysis and fate-mapping, the mechanisms that control the progression of lung progenitors along distinct lineages into mature alveolar cell types are still incompletely known, in part because of the limited number of lineage markers (refs 1–3) and the effects of ensemble averaging in conventional transcriptome analysis experiments on cell populations (refs 1–5). Here we show that single-cell transcriptome analysis circumvents these problems and enables direct measurement of the various cell types and hierarchies in the developing lung. We used microfluidic single-cell RNA sequencing (RNA-seq) on 198 individual cells at four different stages encompassing alveolar differentiation to measure the transcriptional states that define the developmental and cellular hierarchy of the distal mouse lung epithelium. We empirically classified cells into distinct groups by using an unbiased genome-wide approach that did not require a priori knowledge of the underlying cell types or the previous purification of cell populations. The results confirmed the basic outlines of the classical model of epithelial cell-type diversity in the distal lung and led to the discovery of many previously unknown cell-type markers, including transcriptional regulators that discriminate between the different populations. We reconstructed the molecular steps during maturation of bipotential progenitors along both alveolar lineages and elucidated the full life cycle of the alveolar type 2 cell lineage. This single-cell genomics approach is applicable to any developing or mature tissue to robustly delineate molecularly distinct cell types, define progenitors and lineage hierarchies, and identify lineage-specific regulatory factors.

Figure 1: Single-cell RNA-seq of 80 embryonic (E18.5) mouse lung epithelial cells enables unbiased identification of alveolar, bronchiolar and progenitor cell populations.
Figure 2: Single-cell transcriptome analysis discovers previously unknown markers.
Figure 3: Molecular profiles distinguish between developmental intermediates during the differentiation of AT1 and AT2 cells from a common BP.
Figure 4: Single-cell RNA-seq of Sftpc+ cells at E14.5, E16.5, E18.5 and in the adult mouse lung explains progressive transcriptional states of the AT2 cell lineage throughout its life cycle.

Accession codes

Primary accessions

Gene Expression Omnibus

Data deposits

The transcriptome sequencing data for all single cells have been deposited in the Gene Expression Omnibus database under accession number GSE52583.

References

  1. Kim, C. F. B. et al. Identification of bronchioalveolar stem cells in normal lung and lung cancer. Cell 121, 823–835 (2005)

  2. Zemke, A. C. et al. Molecular staging of epithelial maturation using secretory cell-specific genes as markers. Am. J. Respir. Cell Mol. Biol. 40, 340–348 (2009)

  3. Guha, A. et al. Neuroepithelial body microenvironment is a niche for a distinct subset of Clara-like precursors in the developing airways. Proc. Natl Acad. Sci. USA 109, 12592–12597 (2012)

  4. Gonzalez, R. et al. Freshly isolated rat alveolar type I cells, type II cells, and cultured type II cells have distinct molecular phenotypes. Am. J. Physiol. Lung Cell. Mol. Physiol. 288, L179–L189 (2005)

  5. Xu, Y. et al. Transcriptional programs controlling perinatal lung maturation. PLoS ONE 7, e37046 (2012)

  6. Desai, T. J., Brownfield, D. G. & Krasnow, M. A. Alveolar progenitor and stem cells in lung development, renewal and cancer. Nature 507, 190–194 (2014)

  7. Wu, A. R. et al. Quantitative assessment of single-cell RNA-sequencing methods. Nature Methods 11, 41–46 (2013)

  8. Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res. 21, 1160–1167 (2011)

  9. Islam, S. et al. Highly multiplexed and strand-specific single-cell RNA 5′ end sequencing. Nature Protocols 7, 813–828 (2012)

  10. Shalek, A. K. et al. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498, 236–240 (2013)

  11. Sasagawa, Y. et al. Quartz-Seq: a highly reproducible and sensitive single-cell RNA-Seq reveals non-genetic gene expression heterogeneity. Genome Biol. 14, R31 (2013)

  12. Liu, C. L., Bernstein, B. E. & Schreiber, S. L. Whole genome amplification by T7-based linear amplification of DNA (TLAD). II. Second-strand synthesis and in vitro transcription. CSH Protocols, http://dx.doi.org/10.1101/pdb.prot5003 (2008)

  13. Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012)

  14. Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nature Biotechnol. 30, 777–782 (2012)

  15. Tariq, M. A., Kim, H. J., Jejelowo, O. & Pourmand, N. Whole-transcriptome RNAseq analysis from minute amount of total RNA. Nucleic Acids Res. 39, e120 (2011)

  16. Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nature Methods 10, 1096–1098 (2013)

  17. Brennecke, P. et al. Accounting for technical noise in single-cell RNA-seq experiments. Nature Methods 10, 1093–1095 (2013)

  18. Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nature Methods 6, 377–382 (2009)

  19. Tang, F. et al. RNA-Seq analysis to capture the transcriptome landscape of a single cell. Nature Protocols 5, 516–535 (2010)

  20. Tang, F. et al. Tracing the derivation of embryonic stem cells from the inner cell mass by single-cell RNA-Seq analysis. Cell Stem Cell 6, 468–478 (2010)

  21. Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 4, 44–57 (2009)

  22. Yin, Z. et al. Hop functions downstream of Nkx2.1 and GATA6 to mediate HDAC-dependent negative regulation of pulmonary gene expression. Am. J. Physiol. Lung Cell. Mol. Physiol. 291, L191–L199 (2006)

  23. Sock, E. et al. Gene targeting reveals a widespread role for the high-mobility-group transcription factor Sox11 in tissue remodeling. Mol. Cell. Biol. 24, 6635–6644 (2004)

  24. Wang, X. et al. Gene expression profiling and chromatin immunoprecipitation identify DBN1, SETMAR and HIG2 as direct targets of SOX11 in mantle cell lymphoma. PLoS ONE 5, e14085 (2010)

  25. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009)

  26. Dalerba, P. et al. Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nature Biotechnol. 29, 1120–1127 (2011)

  27. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, http://www.R-project.org/

  28. Chapman, H. A. et al. Integrin α6β4 identifies an adult distal lung epithelial population with regenerative potential in mice. J. Clin. Invest. 121, 2855–2862 (2011)

  29. Takeda, N. et al. Interconversion between intestinal stem cell populations in distinct niches. Science 334, 1420–1424 (2011)

  30. Babraham Institute. Babraham Bioinformatics. FASTQC. http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc

  31. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011)

  32. Schmieder, R. & Edwards, R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864 (2011)

  33. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009)

  34. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357–359 (2012)

  35. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009)

  36. Baker, S. C. et al. The External RNA Controls Consortium: a progress report. Nature Methods 2, 731–734 (2005)

  37. Jiang, L. et al. Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21, 1543–1551 (2011)

  38. Zhang, H.-M. et al. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res. 40, D144–D149 (2012)

  39. Walker, M. G., Volkmuth, W., Sprinzak, E., Hodgson, D. & Klingler, T. Prediction of gene function by genome-scale expression analysis: prostate cancer-associated genes. Genome Res. 9, 1198–1203 (1999)

  40. Huang, D. W., Sherman, B. T. & Lempicki, R. A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37, 1–13 (2009)

  41. Greif, D. M. et al. Radial construction of an arterial wall. Dev. Cell 23, 482–493 (2012)

Acknowledgements

We thank W. Koh and B. Passarelli for help and discussions regarding bioinformatic pipelines and statistical analysis, S. I. Gonzalez for help with immunofluorescence, and J. G. Camp and members of the Krasnow laboratory for critical discussion and reading of the manuscript. This work was supported by National Heart, Lung, and Blood Institute (NHLBI) U01HL099995 Progenitor Cell Biology Consortium Grant (B.T., M.A.K., S.R.Q.), by National Institutes of Health (NIH) T32HD007249 (D.G.B.), by a Parker B. Francis Foundation Fellowship and NIH 5KO8HL084095 Award (T.J.D.), and by NIH grant U01HL099999 (A.R.W., N.F.N.). M.A.K. and S.R.Q. are investigators of the Howard Hughes Medical Institute.

Author information

Contributions

B.T., D.G.B., T.D., M.A.K. and S.R.Q. conceived the study and designed the experiments. B.T., D.G.B., F.H.E., A.R.W., N.F.N., G.L.M. and T.D. performed the experiments. B.T., D.G.B., A.R.W., F.H.E., T.D., M.A.K. and S.R.Q. analysed the data and/or provided intellectual guidance in their interpretation. B.T., D.G.B., F.H.E., T.D., M.K. and S.R.Q. wrote the paper.

Corresponding authors

Correspondence to Tushar J. Desai, Mark A. Krasnow or Stephen R. Quake.

Ethics declarations

Competing interests

S.R.Q. is a founder and consultant for Fluidigm Corporation.

Extended data figures and tables

Extended Data Figure 1 Schematic illustration of the process of sacculation.

a, Schematic illustration of morphological and molecular changes of the distal airways during development. Cell differentiation progresses in a directional manner from the bronchio-alveolar junction (proximal) to the distal tip (distal) of each terminal airway; progenitor cells therefore persist the longest at the tips. Ciliated (green) and Clara (blue) cells mature first, followed by the differentiation of flat alveolar type 1 (AT1, orange) and cuboidal type 2 (AT2, red) cells from cuboidal alveolar progenitors during sacculation (E16–18.5), when distal airway tubules widen as nascent AT1 cells flatten to form the gas-exchange surface. b, Micrographs of alveolar (E18.5, postnatal 3 days (PN3d)) and bronchiolar (PN3d) sections of a mouse lung co-stained for Clara (Scgb1a1, green) and ciliated (Foxj1, red) cell markers as well as AT1 (Pdpn, green) and AT2 (Sftpc, red) specific markers. Progenitor cells at the tips of sacculating alveoli are detected by an overlap of AT1 and AT2 specific markers. Newly forming alveolar sacs are marked by asterisks.

Extended Data Figure 2 Single-cell transcriptomics analysis workflow.

a, Workflow of single-cell transcriptomics analysis of mouse lung epithelial cells. A single captured lung epithelial cell stained with Alexa488 for EpCAM (green) is indicated by a red arrow. b, Single lung epithelial cells captured in microfluidic chips with capture sites designed to trap cells with a diameter of 10–17 μm (medium, left) or 17–25 μm (large, right). Cells were stained for viability with Calcein AM. Even cells captured by the large chip did not exceed a diameter of 15 μm, indicating that the medium-sized chips are sufficient for comprehensively profiling distal mouse lung epithelial cells.

Extended Data Figure 3 Assessment of required sequencing depth, technical and biological variation, dynamic range and reproducibility of single-cell RNA-seq data of 80 single distal lung epithelial cells at E18.5.

a, Saturation analysis reveals the sequencing depth required for the detection of most genes expressed by single cells. To detect most expressed genes, single-cell RNA-seq libraries have to be sequenced only to a depth of about 10^6 reads, whereas libraries of bulk samples have to be sequenced more deeply. The number of genes detected in the ensemble of all single cells (synthetic bulk) is comparable to the number of genes detected in the true bulk experiment. Each point on the saturation curve was generated by randomly selecting a number of raw reads from each sample library (bulk, 200 cell bulk library; single cell, single-cell RNA-seq libraries of 80 lung epithelial cells; single-cell ensemble, bioinformatically pooled single-cell libraries) and then using the same alignment pipeline to call genes with a mean FPKM of more than 1. Each point represents four replicate subsamplings; error bars represent s.e.m. b, Technical noise and biological variation in single-cell RNA-seq data. Relationship between mean expression level and coefficient of variation for 10,946 genes in single embryonic lung epithelial cells. Several genes show strong biological variation (blue): they show higher variability than the average noise at a given average gene expression. Housekeeping genes are shown in yellow. c, Average detected transcript levels (mean FPKM, log2) for 92 ERCC RNA spike-ins as a function of provided number of molecules per lysis reaction for each of the three independent single-cell RNA-seq experiments performed at E18.5. Linear regression fits through data points are shown. The length of each ERCC RNA spike-in transcript is encoded in the size and colour of the data points. No particular bias towards the detection of shorter versus longer transcripts is observed. The method shows single transcript sensitivity as well as a dynamic range of approximately six orders of magnitude, in agreement with a previous study evaluating microfluidic single-cell RNA-seq (ref. 7). d, e, Correlation between transcript levels of a 200-cell population and median transcript levels of single cells of the same pool of embryonic lungs (d), and transcript levels of two single AT2 cells (e). r, Pearson correlation coefficients. f, g, Correlation between transcript levels of all genes detected in the single lung and the pooled lung experiment (f) and between transcript levels of all genes detected in the two independent experiments on pooled embryonic lungs (g). Pearson correlation coefficients r are given.
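The subsampling procedure behind the saturation curves in panel a can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it replaces the alignment and FPKM > 1 call with a simple read-count threshold, and the function names, the `min_reads` parameter and the toy library are all hypothetical.

```python
import random
import statistics


def genes_detected(read_genes, depth, min_reads=1, rng=None):
    """Subsample `depth` reads without replacement and count detected genes.

    `read_genes` lists the gene each read maps to. The `min_reads` threshold
    is an assumed stand-in for the FPKM > 1 criterion in the caption.
    """
    if rng is None:
        rng = random.Random()
    counts = {}
    for gene in rng.sample(read_genes, depth):
        counts[gene] = counts.get(gene, 0) + 1
    return sum(1 for n in counts.values() if n >= min_reads)


def saturation_curve(read_genes, depths, replicates=4, seed=0):
    """Return (depth, mean genes detected, s.e.m.) per subsampling depth."""
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        detected = [genes_detected(read_genes, depth, rng=rng)
                    for _ in range(replicates)]
        mean = statistics.mean(detected)
        sem = statistics.stdev(detected) / len(detected) ** 0.5
        curve.append((depth, mean, sem))
    return curve


# Toy library: 50 hypothetical genes with skewed abundances (1 to 128 reads).
toy_reads = [f"gene{i}" for i in range(50) for _ in range(2 ** (i % 8))]
curve = saturation_curve(toy_reads, depths=[10, 100, 1000, len(toy_reads)])
for depth, mean, sem in curve:
    print(depth, mean, round(sem, 2))
```

Four replicate subsamplings per depth mirror the caption; the curve flattens once additional reads stop revealing new genes.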

Extended Data Figure 4 Lineage-specific genes identified by single-cell transcriptome analysis allow functional description of individual distal lung epithelial cell populations.

a, Results of gene ontology (GO) and KEGG pathway enrichment analyses for distal lung epithelial cell types based on lineage-specific genes identified by single-cell RNA-seq of 80 E18.5 distal lung epithelial cells (Supplementary Data). b, c, Correlograms visualizing correlation of single-cell gene expression profiles between transcription factors (b) or receptors/ligands (c) and the major canonical marker genes for bronchiolar and alveolar lineages (AT1: Pdpn; AT2: Sftpc; Clara: Scgb1a1; ciliated: Foxj1). The colour bar denotes the Pearson correlation coefficient from −1 (blue, anticorrelated genes) to 1 (green, positively correlated genes). d, Validation of previously unknown marker genes by single-cell multiplexed qPCR on 74 single cells isolated from the distal mouse lung epithelium at E18.5. Lineage-specific expression of seven new marker genes is shown by clustering with known markers for respective lineages (AT2, red, previously unknown: Cftr, Cebpa, Sftpd and Id2; AT1, orange, previously unknown: Vegfa; ciliated, green, previously unknown: Itgb4 and Top2a; Clara, blue). e, Validation of Hopx expression in AT1 cells. A lung section from a transgenic Hopx>GFP adult mouse (Hopx-Cre-ERT2+/−;mTmG+/tg) was co-stained for AT1 marker Pdpn. Maximum-intensity projections of confocal z stacks show that AT1 cells expressing the membrane-localized GFP reporter (green) also express Pdpn (white). Scale bar, 50 μm. f, Hierarchical clustering of 46 transgenically labelled mature Sftpc+ AT2 cells, isolated by FACS from adult mouse lung. Most genes identified as AT2 lineage-specific from single-cell transcriptomes at E18.5 are transcribed also by mature AT2 cells. In contrast, no or low expression is observed in mature AT2 cells for the genes specific to the other alveolar or bronchiolar lineages as identified from single-cell RNA-seq data at E18.5.
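The correlograms in panels b and c rest on Pearson correlation between each candidate gene and the canonical markers across single cells. A minimal sketch of that calculation is given below; the expression values are invented toy numbers, `geneA` and `geneB` are hypothetical candidates, and the function names are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sx * sy)


def correlate_with_markers(expr, markers):
    """Map each non-marker gene to its correlation with every marker gene.

    `expr` maps gene name -> expression values, one value per cell.
    """
    return {
        gene: {m: pearson(values, expr[m]) for m in markers}
        for gene, values in expr.items()
        if gene not in markers
    }


# Toy expression values for six cells (not data from the study).
expr = {
    "Sftpc": [8, 7, 9, 0, 1, 0],           # AT2 marker named in the caption
    "Pdpn": [0, 1, 0, 7, 8, 9],            # AT1 marker named in the caption
    "geneA": [4, 3.5, 4.5, 0, 0.5, 0],     # hypothetical AT2-like candidate
    "geneB": [1, 2, 1, 8, 9, 10],          # hypothetical AT1-like candidate
}
table = correlate_with_markers(expr, markers=["Sftpc", "Pdpn"])
for gene, row in table.items():
    print(gene, {m: round(r, 2) for m, r in row.items()})
```

A gene whose profile tracks one marker and anticorrelates with the others is a candidate lineage-specific marker, which is what the blue-to-green colour scale encodes.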

Extended Data Figure 5 Molecular profiles distinguish developmental intermediates during the differentiation of AT1 and AT2 cells from a common BP.

a, Hierarchical clustering of multiplexed qPCR gene expression data for 33 single cells from E16.5 lung epithelium (CD45/EpCAM+) suggests the presence at this time point of two major cell lineages, bronchiolar (cyan) and alveolar (brown) progenitors. Note that alveolar progenitors express a subset of both AT1 and AT2 marker genes. b, PCA of multiplexed qPCR data of lung epithelial cells at E16.5 identifies two gene groups in contrast to three observed at E18.5 (Fig. 1c). AT1 and AT2 specific marker genes do not segregate into distinct populations at E16.5. c, Hierarchical clustering of multiplexed qPCR gene expression data for 74 single embryonic lung epithelial cells (CD45/EpCAM+) at E18.5 shows multiple distinct cell populations consistent with RNA-sequencing data at this time point: BP, AT1, AT2, Clara and ciliated cells. Each row represents a single cell and each column a gene. Cells are clustered on the basis of expression of marker genes for alveolar and bronchiolar lineages (AT2: Abca3, Sftpb, Muc1, Lyz2, Sftpc; AT1: Aqp5, Pdpn, Ager; ciliated: Foxj1; Clara: Scgb1a1). d, PCA of multiplexed qPCR data replicates gene families found by single-cell RNA-seq at E18.5. Gene groups were characterized on the basis of differential correlation with the first two principal components. e, Developmental sequence of AT1 (orange) and AT2 (red) specification from a common BP (brown). Two and three maturation intermediates were identified in the specification process of AT2 and AT1 cell types, respectively, on the basis of the expression of known and previously unknown marker genes for both alveolar lineages measured by single-cell RNA-seq (Fig. 3). Transcription factors and receptors/ligands shown here were found to be expressed in BP cells and subsequently restricted to one of the alveolar lineages. Arrows, differentiation pathway; grey braces, change in transcript level of respective genes with tip pointing towards lower expression. f–i, Protein level heterogeneity of alveolar epithelial markers during sacculation. f, Immunofluorescent micrograph from an E19.5 lung with mature AT1 and AT2 cells stained for their respective markers (Pdpn (white) and Ager (red) for AT1; Sftpc (green) for AT2). BPs are positive for all three markers. Cells in intermediate states are observed, such as early AT1 (Pdpn and Ager positive, Sftpc low) and early AT2 cells (Sftpc positive, and either Pdpn positive/Ager low or Pdpn low/Ager negative). Scale bar, 10 μm. g, Markers of late AT2 cells are expressed heterogeneously at E18.5. Immunofluorescence micrograph of a lung from a Lyz2–enhanced green fluorescent protein (eGFP) transgenic mouse, in which within the epithelium (E-cadherin, blue) only a subset of Sftpc (green)-positive AT2 cells are Lyz2 (red)-positive. Scale bar, 20 μm. h, Immunofluorescent staining of E18.5 lung tissue for Lamp3 (red) shows heterogeneous expression of Lamp3 in Sftpc-positive cells (green): proximal cells show higher Lamp3 expression than distal cells. Blue, DAPI-stained nuclei. Scale bar, 20 μm. i, Immunofluorescent staining of E18.5 lung tissue for S100a6 (red) shows heterogeneous expression of the secreted protein S100a6 in Pdpn-positive cells (green). Blue, DAPI-stained nuclei. Scale bar, 20 μm.

Extended Data Figure 6 Following Sftpc-expressing cells throughout their life cycle.

a, Whole-mount in situ hybridizations of embryonic mouse lungs at E11.5, E13.5 and E14.5 using probes against Sftpc mRNA show expression of Sftpc specific to the tips of the epithelial tree branches. Moreover, variations in signal intensity indicate heterogeneity in the level of Sftpc expression across cells, which is in agreement with our single-cell RNA-seq data of Sftpc+ cells at E14.5 (see Fig. 4a). b, Diagram of the different transcriptional states in the specification of an AT2 cell as identified by single-cell RNA-seq of Sftpc+ cells from distal mouse lung epithelium of embryonic (E14.5, E16.5 and E18.5) and adult mice. The cell undergoes a transition from an early (A) and late (B) early progenitor state into a BP state before either taking the AT1 fate (nascent AT1), or following the AT2 pathway to become a nascent and finally a mature AT2 cell. Groups of genes turning on/up or off/down during the individual transitions are shown above and below each arrow, respectively (Fig. 4a and Supplementary Data). Whereas EP and BP cells are double positive for Sftpc and Pdpn, nascent and mature AT2 cells express Sftpc but turn off expression of the AT1 marker Pdpn. The developmental time points at which the individual cell states were detected, and their putative locations, are shown.

Extended Data Figure 7 The number of unique genes and the total number of transcripts expressed by a single cell strongly correlates with its differentiation state.

a, Saturation analysis of single-cell RNA-seq data of lung epithelial cells at different embryonic and adult time points (E14.5, E18.5 and adult AT2) reveals that the number of unique genes expressed by single lung epithelial cells decreases with progressing differentiation state. Distal lung epithelial cells at E14.5 express more than 6,000 genes, whereas cells at E18.5 express about 3,000 genes, and mature AT2 cells only about 2,000 genes. Each point on the saturation curve was generated by randomly selecting a number of raw reads from each sample library and then using the same alignment pipeline to call genes with a mean FPKM of more than 1. Each point represents four replicate subsamplings. Error bars represent s.e.m. All libraries were sequenced to a depth of at least 2 × 10^6 reads. b, Single-cell RNA-seq reveals that the total number of transcripts expressed by single cells decreases with increasing differentiation state of the cell. The number of transcripts per cell was calculated from the FPKM values of all genes in each cell, using the correlation between number of transcripts of exogenous spike-in mRNA sequences and their respective measured mean FPKM values (example calibration curves are shown in Extended Data Fig. 3c for three replicates at E18.5). Area-normalized density distributions are shown for embryonic cells at E14.5 (45 cells), E16.5 (27 cells) and E18.5 (80 cells), and for 46 Sftpc+ adult AT2 cells. The number of transcripts is highest in lung epithelial progenitor cells at E16.5 and E14.5 and decreases in cells at E18.5 and even further in mature AT2 cells. Note that single-cell RNA-seq libraries for E14.5, E18.5 and adult AT2 cells were sequenced to a depth of (2–6) × 10^6 reads, whereas the libraries for cells at E16.5 were sequenced to a lower depth of 100,000–550,000 reads. c, Calibration of Ct values measured by single-cell qPCR to number of molecules. Average detected transcript levels (log2 Ex = Ct,LoD − Ct, with Ct,LoD = 22) for six ERCC RNA spike-ins as a function of provided number of molecules per lysis reaction for each of three independent single-cell qPCR experiments performed on embryonic (E16.5, two replicates; red and green) and adult mouse lung (adult AT2, one replicate; blue). Linear regression fits through data points and corresponding equations are shown and were used to convert Ct values measured by qPCR into numbers of transcripts. d, Single-cell qPCR confirms the presence of a higher number of transcripts in lung epithelial progenitor cells in comparison with fully differentiated alveolar epithelial cells. The median number of transcripts per cell as detected by single-cell RNA-seq (y axis) and by single-cell multiplexed qPCR of 90 genes (x axis) is shown for distal lung epithelial cells at E16.5 (qPCR, 33 cells; RNA-seq, 27 cells) and mature AT2 cells (qPCR, 48 cells; RNA-seq, 46 cells).
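The spike-in calibration described in panels b and c amounts to fitting a line between the known number of spike-in molecules and the measured signal on a log2 scale, then inverting that line for endogenous genes. The sketch below shows one way this could look. The fitting in log2 space, the function names and the toy spike-in values are assumptions; only the relation log2 Ex = Ct,LoD − Ct with Ct,LoD = 22 comes from the caption.

```python
CT_LOD = 22.0  # limit-of-detection Ct given in the caption


def log2_expression_from_ct(ct):
    """Convert a qPCR Ct value to log2 expression: log2 Ex = Ct,LoD - Ct."""
    return CT_LOD - ct


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def molecules_from_signal(log2_signal, slope, intercept):
    """Invert the calibration line to estimate the number of molecules."""
    return 2 ** ((log2_signal - intercept) / slope)


# Toy spike-ins (hypothetical values): 1, 8, 64 and 512 molecules per reaction.
log2_molecules = [0.0, 3.0, 6.0, 9.0]
log2_signal = [2.0, 5.0, 8.0, 11.0]  # e.g. log2 FPKM, or log2 Ex from Ct

slope, intercept = fit_line(log2_molecules, log2_signal)
estimate = molecules_from_signal(7.0, slope, intercept)
print(slope, intercept, estimate)  # 1.0 2.0 32.0 for this toy calibration
print(log2_expression_from_ct(18.0))  # 4.0
```

Summing such per-gene estimates over all genes in a cell would give a transcripts-per-cell figure of the kind compared across time points in panels b and d.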

Extended Data Figure 8 Transcriptional states during the early lifetime of the Clara cell lineage identified by single-cell RNA-seq of Scgb3a2+ cells at E14.5, E16.5 and E18.5.

a, Hierarchical clustering of 24 Scgb3a2-positive cells from distal mouse lung epithelium at different embryonic time points (E14.5, E16.5 and E18.5) based on the genes with highest principal-component loadings in an unbiased PCA of all cells and all genes (shown in c). Cells are shown in rows, genes in columns. Cells cluster into three major groups. Scgb3a2 and Scgb1a1 transcript levels are shown in bars on the right. Whereas canonical Clara cell marker Scgb1a1 is first detected at E18.5, Scgb3a2 is detected as early as E14.5, suggesting that it is an early Clara cell marker. b, Gene Ontology (GO) enrichments of the three different gene clusters as well as transcription factors (TFs) belonging to the different groups of genes. c, PCA of all Scgb3a2-positive cells and all genes identifies three different cell populations that were identified as bronchiolar progenitor as well as Clara and ciliated cells.
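Selecting "the genes with highest principal-component loadings", as in panel a, can be illustrated with a small dependency-free sketch. It computes only the first principal component by power iteration, which is a simplification chosen here for brevity and not the authors' R analysis; the gene names `g0` to `g4`, the toy matrix and both function names are hypothetical.

```python
import math


def first_pc_loadings(matrix, iterations=200):
    """Unit-length gene loadings of the first principal component.

    `matrix` holds one row per cell and one column per gene. Power iteration
    on the gene-by-gene covariance matrix is used as an assumed shortcut.
    """
    n_cells, n_genes = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_cells for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in matrix]
    cov = [[sum(r[i] * r[j] for r in centered) / (n_cells - 1)
            for j in range(n_genes)] for i in range(n_genes)]
    vec = [1.0 + 0.01 * j for j in range(n_genes)]  # deterministic start
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_genes))
               for i in range(n_genes)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    return vec


def top_loading_genes(names, loadings, n):
    """Names of the n genes with the largest absolute loadings."""
    ranked = sorted(zip(names, loadings), key=lambda p: abs(p[1]), reverse=True)
    return [name for name, _ in ranked[:n]]


# Toy matrix: six cells by five hypothetical genes; g4 is constant.
names = ["g0", "g1", "g2", "g3", "g4"]
cells = [
    [9, 8, 0, 1, 5], [8, 9, 1, 0, 5], [9, 9, 0, 0, 5],
    [0, 1, 9, 8, 5], [1, 0, 8, 9, 5], [0, 0, 9, 9, 5],
]
loadings = first_pc_loadings(cells)
selected = top_loading_genes(names, loadings, n=4)
print(selected)
```

The selected genes would then feed a hierarchical clustering of cells; genes loading with opposite signs on the same component, like `g0` and `g2` here, separate the two toy populations.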

Supplementary information

Supplementary Tables

This file contains Supplementary Tables 1-2. (PDF 214 kb)

Supplementary Data 1

This file contains alignment statistics for all single cells with sequenced transcriptome. (XLSX 61 kb)

Supplementary Data 2

This zipped file contains R script used to analyze single cell RNA-seq data as .txt files. (ZIP 21 kb)

Supplementary Data 3

This file contains single cell RNA-seq expression data (log2(FPKM) values) for all 80 lung epithelial cells at E18.5 together with the putative cell type of each cell in a .txt file. (TXT 6859 kb)

Supplementary Data 4

This file contains listing of putative novel marker genes for bronchiolar and alveolar cell types identified by single cell transcriptome analysis together with correlation coefficients and p-values (Methods) as well as information regarding previous detection of each of these genes in cell types in the lung, available literature or known mouse knock-out phenotypes. (XLSX 154 kb)

Supplementary Data 5

This file contains the gene ontology and KEGG pathway enrichment analysis results of cell type specific genes for AT1, AT2, Clara and Ciliated cells as identified in the single cell RNA-seq data at E18.5 in an excel file. (XLSX 190 kb)

Supplementary Data 6

This file contains the genes identified by PCA to describe the variation in the data set of all Sftpc+ cells across 4 different time points together with Gene ontology enrichment analysis results for the different group of genes. (XLSX 569 kb)

Cite this article

Treutlein, B., Brownfield, D., Wu, A. et al. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509, 371–375 (2014). https://doi.org/10.1038/nature13173

  • DOI: https://doi.org/10.1038/nature13173
