Proc Natl Acad Sci U S A. 2002 Jan 22;99(2):757-62.
doi: 10.1073/pnas.231608898.

Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome


Benjamin P Berman et al. Proc Natl Acad Sci U S A.

Abstract

A major challenge in interpreting genome sequences is understanding how the genome encodes the information that specifies when and where a gene will be expressed. The first step in this process is the identification of regions of the genome that contain regulatory information. In higher eukaryotes, this cis-regulatory information is organized into modular units [cis-regulatory modules (CRMs)] of a few hundred base pairs. A common feature of these cis-regulatory modules is the presence of multiple binding sites for multiple transcription factors. Here, we evaluate the extent to which the tendency for transcription factor binding sites to be clustered can be used as the basis for the computational identification of cis-regulatory modules. By using published DNA binding specificity data for five transcription factors active in the early Drosophila embryo, we identified genomic regions containing unusually high concentrations of predicted binding sites for these factors. A significant fraction of these binding site clusters overlap known CRMs that are regulated by these factors. In addition, many of the remaining clusters are adjacent to genes expressed in a pattern characteristic of genes regulated by these factors. We tested one of the newly identified clusters, mapping upstream of the gap gene giant (gt), and show that it acts as an enhancer that recapitulates the posterior expression pattern of gt.

Figures

Figure 1
Distribution of predicted transcription factor binding sites and binding site clusters in the vicinity of eve. (A) Predicted high-affinity (P < 0.0003) binding sites for the transcription factors Bcd, Cad, Hb, Kr, and Kni in 1 Mb of genomic sequence surrounding the gene even-skipped (eve) are displayed as colored boxes. Blue boxes in the center of the panel represent positions of annotated exons, with eve highlighted in red. Binding sites and genes shown above the midline map to the forward DNA strand; those below the midline map to the reverse strand. (B) Sites from A that occur in 700-bp windows containing at least 13 predicted binding sites. (C) Expanded view of region containing all clusters in B, with positions of known eve enhancers marked with gray ellipses.
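The window criterion in panel B (700-bp windows containing at least 13 predicted binding sites) can be illustrated with a minimal sketch. It assumes each predicted site is reduced to a single integer coordinate and that overlapping qualifying windows are merged into one cluster; the function name `find_dense_windows` and both simplifications are hypothetical and are not taken from the paper's own implementation.

```python
from bisect import bisect_right

def find_dense_windows(positions, window=700, min_sites=13):
    """Return merged (first_site, last_site) spans of dense windows.

    positions: integer coordinates of predicted binding sites
               (assumption: one coordinate per site, all factors pooled).
    window:    window width in bp (700 in the Figure 1 caption).
    min_sites: minimum sites per window (13 in the Figure 1 caption).
    """
    pos = sorted(positions)
    clusters = []
    for i, start in enumerate(pos):
        # Any qualifying window can be slid right until its left edge sits
        # on a site, so testing windows anchored at each site is sufficient.
        j = bisect_right(pos, start + window - 1)  # one past last site in window
        if j - i >= min_sites:
            span = (start, pos[j - 1])
            if clusters and span[0] <= clusters[-1][1]:
                # Overlaps the previous cluster: extend it.
                prev = clusters[-1]
                clusters[-1] = (prev[0], max(prev[1], span[1]))
            else:
                clusters.append(span)
    return clusters

# Toy example: 13 sites spaced 50 bp apart, plus two isolated sites.
sites = list(range(1000, 1000 + 13 * 50, 50)) + [5000, 9000]
print(find_dense_windows(sites))
```

Lowering `min_sites` corresponds to moving left along the density axis in Figure 2: more clusters are reported, at the cost of more that do not overlap known CRMs.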
Figure 2
Binding site clusters identified as a function of binding site density. (A) Number of binding site clusters in 93 Mb of noncoding genomic DNA at varying densities. Number of clusters overlapping test CRMs is shown in blue. Number of additional clusters is shown in pink. (B) Sensitivity (test set CRMs recovered divided by total number of test set CRMs) is shown in blue. Specificity (test set CRMs recovered divided by total number of clusters identified) is shown in pink. It is important to note that these sensitivity and specificity measures are computed assuming that only previously known CRMs are true positives. Because there are almost certainly additional bona fide CRMs in this set, the actual specificities and sensitivities of the method are expected to be better. Dotted line indicates density level chosen for the exploration of novel clusters described in the text.
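The two measures defined in panel B can be written out as a short sketch. It follows the caption's definitions literally (both share the numerator "test set CRMs recovered") and assumes that a CRM counts as recovered when it overlaps at least one cluster by one or more base pairs; the overlap rule, the function names, and the toy coordinates are illustrative assumptions.

```python
def overlaps(a, b):
    """True if closed intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] <= b[1] and b[0] <= a[1]

def evaluate(clusters, known_crms):
    """Sensitivity and specificity as defined in the Figure 2 caption.

    sensitivity = test set CRMs recovered / total test set CRMs
    specificity = test set CRMs recovered / total clusters identified
    """
    recovered = sum(
        1 for crm in known_crms if any(overlaps(crm, c) for c in clusters)
    )
    sensitivity = recovered / len(known_crms) if known_crms else 0.0
    specificity = recovered / len(clusters) if clusters else 0.0
    return sensitivity, specificity

# Toy data: four predicted clusters, three known CRMs, two of them recovered.
clusters = [(100, 800), (5000, 5700), (9000, 9700), (20000, 20700)]
known_crms = [(500, 1200), (5600, 6000), (30000, 30500)]
sens, spec = evaluate(clusters, known_crms)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

As the caption notes, clusters that overlap no known CRM are counted against specificity here even though some of them may be genuine, previously unknown CRMs.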
Figure 3
Expression patterns of selected genes flanking novel binding site clusters. We examined the expression patterns of 49 genes adjacent to one of the 28 novel binding site clusters described in Table 2 in syncytial and cellular blastoderm embryos [whole mount RNA in situ images are available in Table 2 (which is published as supporting information on the PNAS web site) and on the Berkeley Drosophila Genome Project website (http://www.fruitfly.org/)]. Eleven of these genes representing 10 clusters had early embryonic expression patterns characteristic of genes regulated by maternal and gap transcription factors and are shown here. §, References for flanking genes are as follows: gt (, , –40), otd (–43), btd (44, 45), pdm1 (46), pdm2 (46), Dfd (–49), Antp (49, 50), ftz (–53), odd (54), and psq (55).
Figure 4
Identification of a novel enhancer controlling posterior expression of giant. (A) Cluster of binding sites found between 2.9 kb and 1.8 kb upstream of giant. The DNA segment surrounding the cluster (labeled “posterior enhancer”) was cloned into a lacZ fusion construct and introduced into the genome via germline transformation as described in Materials and Methods. (B and C) Expression of giant in syncytial blastoderm stage embryos as determined by RNA in situ hybridization. B shows a wild-type embryo, and C shows a Kr1/Kr1 embryo lacking Krüppel (Kr) function. Without repression by Kr, the anterior border of the posterior expression domain shifts anteriorly. (D and E) Expression of lacZ in embryos containing the construct from A. D shows a wild-type embryo, and E shows a Kr1/Kr1 embryo. Expression of the lacZ construct in the mutant embryo shows an expansion similar to that seen for gt.

References

    1. Carroll S B, Grenier J K, Weatherbee S D. From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. Oxford: Blackwell Scientific; 2001.
    2. Harding K, Hoey T, Warrior R, Levine M. EMBO J. 1989;8:1205–1212.
    3. Goto T, Macdonald P, Maniatis T. Cell. 1989;57:413–422.
    4. Stanojevic D, Small S, Levine M. Science. 1991;254:1385–1387.
    5. Small S, Blair A, Levine M. Dev Biol. 1996;175:314–324.
