[Preprint]. 2024 Sep 26:2024.09.24.614814.
doi: 10.1101/2024.09.24.614814.

Cell-type-specific mapping of enhancers and target genes from single-cell multimodal data

Chang Su et al. bioRxiv.

Abstract

Mapping enhancers and target genes in disease-related cell types has provided critical insights into the functional mechanisms of genetic variants identified by genome-wide association studies (GWAS). However, most existing analyses rely on bulk data or cultured cell lines, which may fail to identify cell-type-specific enhancers and target genes. Recently, single-cell multimodal data measuring both gene expression and chromatin accessibility within the same cells have enabled the inference of enhancer-gene pairs in a cell-type-specific and context-specific manner. However, this task is challenged by the data's high sparsity, sequencing depth variation, and the computational burden of analyzing a large number of enhancer-gene pairs. To address these challenges, we propose scMultiMap, a statistical method that infers enhancer-gene associations from sparse multimodal counts using a joint latent-variable model. It adjusts for technical confounding, permits fast moment-based estimation, and provides analytically derived p-values. In systematic analyses of blood and brain data, scMultiMap shows appropriate type I error control, high statistical power with greater reproducibility across independent datasets, and stronger consistency with orthogonal data modalities. Meanwhile, its computational cost is less than 1% of that of existing methods. When applied to single-cell multimodal data from postmortem brain samples from Alzheimer's disease (AD) patients and controls, scMultiMap gave the highest heritability enrichment in microglia and revealed new insights into the regulatory mechanisms of AD GWAS variants in microglia.

Keywords: cell-type-specific analysis; enhancer; latent-variable model; single-cell multimodal data.
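
To make the phrase "moment-based estimation" in the abstract concrete, the following is a minimal sketch of a generic method-of-moments correlation estimator for depth-confounded counts. It is not taken from the paper and should not be read as scMultiMap's estimator: the Poisson measurement model, the function name moment_correlation, and the arguments sx and sy (per-cell sequencing depths for the two modalities) are all assumptions made for illustration.

```python
import math


def moment_correlation(x, y, sx, sy):
    """Estimate the correlation between two latent levels from counts.

    Assumed model (illustrative, not confirmed by the source):
        x[i] ~ Poisson(sx[i] * zx[i]),  y[i] ~ Poisson(sy[i] * zy[i]),
    where sx, sy are per-cell sequencing depths and (zx, zy) are latent
    gene-expression and peak-accessibility levels.
    Returns None when either variance estimate is not positive.
    """
    n = len(x)
    if n == 0 or not (len(y) == len(sx) == len(sy) == n):
        raise ValueError("inputs must be non-empty and of equal length")

    # First moments: E[x_i] = sx_i * mu_x, so pool counts over pooled depth.
    mu_x = sum(x) / sum(sx)
    mu_y = sum(y) / sum(sy)

    # Depth-adjusted centered counts.
    rx = [xi - si * mu_x for xi, si in zip(x, sx)]
    ry = [yi - si * mu_y for yi, si in zip(y, sy)]

    # Second moments: E[rx_i^2] = sx_i * mu_x + sx_i^2 * var_x, so the
    # Poisson noise term sx_i * mu_x is subtracted before rescaling.
    var_x = sum(r * r - s * mu_x for r, s in zip(rx, sx)) / sum(s * s for s in sx)
    var_y = sum(r * r - s * mu_y for r, s in zip(ry, sy)) / sum(s * s for s in sy)

    # Cross moment: E[rx_i * ry_i] = sx_i * sy_i * cov_xy, because the two
    # counts are assumed independent given the latent levels.
    cov = sum(a * b for a, b in zip(rx, ry)) / sum(a * b for a, b in zip(sx, sy))

    if var_x <= 0 or var_y <= 0:
        return None

    # Moment estimates can fall outside [-1, 1] in small samples; clip them.
    rho = cov / math.sqrt(var_x * var_y)
    return max(-1.0, min(1.0, rho))
```

Because every quantity is a closed-form sum over cells, an estimator of this kind needs no iterative fitting per peak-gene pair, which is the general reason moment-based approaches tend to be fast. Subtracting the Poisson noise term is what lets the estimate target the latent levels rather than the raw, depth-dependent counts.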

Figures

Figure 1:
Performance evaluation of scMultiMap, SCENT and Signac using single-cell multimodal data on PBMC from 10x Genomics [21, 22, 23, 26]. A. Empirical type I errors on null data with independent gene expression and peak accessibility levels. The dashed line marks the nominal level of 0.05. B. Precision-recall curves on simulated data, with the same color legend as in A. C. Computing time in hours (log scale) on a dataset with 729 cells and 31,132 candidate peak-gene pairs.
Figure 2:
Comparison of reproducibility across methods and validation of regulatory trios inferred by scMultiMap. A. Number of significant peak-gene pairs reproduced between biological replicates and between technical replicates of single-cell multimodal data on CD14 monocytes across varying BH-adjusted p-value cutoffs. See B for the color legend. B. Consistency of scMultiMap findings with orthogonal data modalities. Significant peak-gene pairs on CD14 monocytes (BH-adjusted p-value < 0.1) were compared against enhancer-gene pairs measured by promoter capture Hi-C [18], H3K27ac HiChIP [19] and cell-type-specific eQTLs [27] (Methods). In A-B, statistical significance of the overlapped counts is evaluated with one-sided Fisher exact tests (−log p values are shown). C. Enrichment of GO biological processes among the target genes in trios identified for each TF in CD14 monocytes.
Figure 3:
Results from scMultiMap on single-cell multimodal data from postmortem brain samples in [32]. A. Empirical type I errors on permuted data. Normalized peak values were randomly permuted within each subject to break their dependency with gene expression while preserving variation among subjects. The dashed line marks the nominal level of 0.05. B. Precision-recall curves on simulated data. See color legend in A. C. Consistency of significant pairs (BH-adjusted p-value < 0.2) with enhancer-gene pairs measured by PLAC-seq [3] in excitatory neurons (Exc), inhibitory neurons (Inh) and oligodendrocytes (Oli) (Methods). See color legend in A. D-E. Reproducibility of significant pairs with independent single-cell multimodal data on brain samples from [33] across cutoffs of BH-adjusted p-values, as evaluated by the enrichment (D) and the number (E) of reproduced counts. In C-E, enrichment is quantified by odds ratio (OR) and log OR in C and D respectively (OR=0 not shown), and p-values are from one-sided Fisher exact tests. F-G. Enrichment of GO biological processes among the target genes in trios identified for each TF in excitatory neurons (F) and astrocytes (G). Color intensity is given by BH-adjusted −log10 p-values from one-sided Fisher exact tests (values larger than 10 were set to 10).
Figure 4:
Studying the functional role of selected AD GWAS variants in microglia with scMultiMap. A. AD heritability enrichment for peaks from significant peak-gene pairs (raw p-value < 0.05) in microglia. Summary statistics from three AD GWAS studies were used: Jansen [44], Kunkle [45] and Bellenguez [41]. * and ** denote p-value < 0.1 and 0.01 respectively for one-sided p-values of heritability enrichment from S-LDSC. B. Differential peak-gene pairs in microglia from control and AD subjects. C. Enrichment of GO biological processes among the genes from significantly differential peak-gene pairs. BH-adjusted p-values from one-sided Fisher exact tests are shown. D. scMultiMap mapped AD variant rs10792831 to PICALM in microglia from control subjects; the association is not significant in microglia from AD subjects. E. scMultiMap mapped AD variant rs4075111 to INPP5D in microglia from control subjects; the association is not significant in microglia from AD subjects.
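
The figure legends above repeatedly rely on one-sided Fisher exact tests for overlap and on BH-adjusted p-values. As a rough illustration of those two standard procedures, here is a minimal standard-library sketch. The function names fisher_one_sided and bh_adjust and the way the overlap counts are passed in are hypothetical choices for this example; the source does not describe how the authors implemented either step.

```python
from math import comb


def fisher_one_sided(overlap, n_called, n_reference, n_total):
    """Upper-tail p-value of a one-sided Fisher exact test for enrichment.

    Hypothetical setup: out of n_total candidate peak-gene pairs, n_called
    are significant and n_reference are supported by an orthogonal assay;
    `overlap` pairs are in both sets. Returns P(X >= overlap) under the
    hypergeometric null of no association.
    """
    upper = min(n_called, n_reference)
    denom = comb(n_total, n_called)
    # comb(n, k) is 0 when k > n, so impossible tables contribute nothing.
    tail = sum(
        comb(n_reference, k) * comb(n_total - n_reference, n_called - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denom


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy usage with made-up counts (not data from the paper).
p_overlap = fisher_one_sided(overlap=4, n_called=5, n_reference=5, n_total=10)
adj = bh_adjust([0.01, 0.04, 0.03, 0.005])
significant = [a < 0.1 for a in adj]
```

The exact integer arithmetic in fisher_one_sided keeps the sketch dependency-free; for genome-scale counts a library routine such as a hypergeometric survival function would usually be the more practical choice.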

References

    1. Maurano M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
    2. Mostafavi H., Spence J. P., Naqvi S. & Pritchard J. K. Systematic differences in discovery of genetic effects on gene expression and complex traits. Nature Genetics 55, 1866–1875 (2023).
    3. Nott A. et al. Brain cell type–specific enhancer–promoter interactome maps and disease-risk association. Science 366, 1134–1139 (2019).
    4. Pennacchio L. A., Bickmore W., Dean A., Nobrega M. A. & Bejerano G. Enhancers: five essential questions. Nature Reviews Genetics 14, 288–295 (2013).
    5. Gasperini M., Tome J. M. & Shendure J. Towards a comprehensive catalogue of validated and target-linked human enhancers. Nature Reviews Genetics 21, 292–310 (2020).
