Methods for calculating the probabilities of finding patterns in sequences
- PMID: 2720468
- DOI: 10.1093/bioinformatics/5.2.89
Abstract
This paper describes the use of probability-generating functions for calculating the probabilities of finding motifs in nucleic acid and protein sequences. Equations and algorithms are given for calculating the probabilities associated with nine different ways of defining motifs, and the calculated values are compared with the results of searches of random sequences. A higher-level structure, the pattern, is defined as a list of motifs together with the permitted ranges of spacing between its constituent motifs. Equations for calculating the expected numbers of matches to patterns are given.
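As an illustration of the general idea only, the following minimal sketch shows how a probability-generating function can give the chance that a single window of random sequence matches a motif. It is not taken from the paper: the weight-matrix motif definition, the assumption of independent residues with fixed frequencies, and the names `score_distribution`, `tail_probability` and `expected_matches` are all assumptions made for this example. Each motif position contributes a polynomial whose exponents are scores and whose coefficients are residue frequencies; multiplying the polynomials yields the distribution of the total score.

```python
from collections import defaultdict
from fractions import Fraction


def score_distribution(weights, freqs):
    """Return {total_score: probability} for one window of random sequence.

    weights: list of dicts, one per motif position, mapping residue -> score.
    freqs:   dict mapping residue -> background frequency (assumed independent).
    Each position is the polynomial sum_a freqs[a] * x**weights[pos][a];
    the loop below multiplies these polynomials together.
    """
    dist = {0: Fraction(1)}
    for column in weights:
        nxt = defaultdict(Fraction)
        for total, p in dist.items():
            for residue, w in column.items():
                nxt[total + w] += p * freqs[residue]
        dist = dict(nxt)
    return dist


def tail_probability(dist, cutoff):
    """Probability that a window scores at least `cutoff`."""
    return sum(p for score, p in dist.items() if score >= cutoff)


def expected_matches(p_window, seq_len, motif_len):
    """Expected number of matching windows in a sequence of length seq_len."""
    return max(seq_len - motif_len + 1, 0) * p_window


# Hypothetical example: the motif "ACG" scored 1 per matching base, 0 otherwise,
# against a uniform DNA background.
freqs = {base: Fraction(1, 4) for base in "ACGT"}
weights = [{b: int(b == target) for b in "ACGT"} for target in "ACG"]

dist = score_distribution(weights, freqs)
p_exact = tail_probability(dist, 3)        # all three positions match
p_one_mismatch = tail_probability(dist, 2)  # at most one mismatch
n_expected = expected_matches(p_exact, 1000, 3)
```

Exact fractions are used so that the coefficients of the generating function can be checked by hand. Note that the expected count treats every window alike and so says nothing about overlaps between matches; the probability of at least one match in a whole sequence needs further treatment that this sketch does not attempt.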