Compound Poisson approximation of the number of occurrences of a position frequency matrix (PFM) on both strands






Journal of Computational Biology 15(6): 547-564



Transcription factors play a key role in gene regulation by interacting with specific binding sites, or motifs. Enrichment of binding motifs is therefore important for genome annotation, and efficient computation of the statistical significance (the p-value) of that enrichment is crucial. We propose an efficient approximation for computing this significance. By incorporating both strands of the DNA molecule and explicitly modeling dependencies between overlapping hits, we achieve accurate results for any DNA motif based on its Position Frequency Matrix (PFM) representation. The accuracy of the p-value approximation is shown by comparison with the simulated count distribution. We also compare the approach with a binomial approximation, a (compound) Poisson approximation, and a normal approximation. In general, our approach outperforms these approximations, or is equally good but significantly faster. An implementation of our approach is available at http://mosta.molgen.mpg.de.
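
To illustrate what a compound Poisson p-value for a motif hit count can look like, here is a minimal sketch. It is not the authors' method: it uses the generic Panjer recursion for a compound Poisson distribution, and the inputs (`lam`, the expected number of hit clumps, and `clump_size_probs`, the distribution of hits per clump) are hypothetical values that a real analysis would have to derive from the PFM, the background model, and the overlap structure on both strands.

```python
import math


def compound_poisson_pmf(lam, clump_size_probs, n_max):
    """Return [P(X = 0), ..., P(X = n_max)] for a compound Poisson count X.

    X is the sum of the sizes of N clumps, with N ~ Poisson(lam) and
    clump_size_probs[k - 1] = P(clump size = k) for k >= 1.
    Uses the Panjer recursion:
        P(X = n) = (lam / n) * sum_k k * P(size = k) * P(X = n - k).
    """
    pmf = [math.exp(-lam)]  # P(X = 0): no clumps at all
    for n in range(1, n_max + 1):
        total = 0.0
        for k in range(1, min(n, len(clump_size_probs)) + 1):
            total += k * clump_size_probs[k - 1] * pmf[n - k]
        pmf.append(lam / n * total)
    return pmf


def p_value(observed, lam, clump_size_probs):
    """Upper-tail probability P(X >= observed) under the compound Poisson model."""
    if observed <= 0:
        return 1.0
    pmf = compound_poisson_pmf(lam, clump_size_probs, observed - 1)
    return max(0.0, 1.0 - sum(pmf))


# Hypothetical example: 2 expected clumps, 80% single hits, 20% double hits.
print(p_value(6, 2.0, [0.8, 0.2]))
```

With `clump_size_probs = [1.0]` every clump contains exactly one hit and the model reduces to a plain Poisson approximation, which gives a simple way to see the effect of modeling overlapping hits.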


Accession: 052269212


PMID: 18631020

DOI: 10.1089/cmb.2007.0084

