Tight Bounds and Approximations for Scan Statistic Probabilities for Discrete Data
Glaz, Joseph ; Naus, Joseph I.
Ann. Appl. Probab., Volume 1 (1991) no. 4, p. 306-318 / Harvested from Project Euclid
Let $X_1, X_2, \ldots$ be a sequence of independently and identically distributed integer-valued random variables. Let $Y_{t - m + 1,t}$ for $t = m, m + 1,\ldots$ denote a moving sum of $m$ consecutive $X_i$'s. Let $N_{m,T} = \max_{m \leq t \leq T} \{Y_{t - m + 1,t}\}$ and let $\tau_{k,m}$ be the waiting time until the moving sum of $X_i$'s in a scanning window of $m$ trials is as large as $k$. We derive tight bounds for the equivalent probabilities $P(\tau_{k,m} > T) = P(N_{m,T} < k)$. We apply the bounds to two problems in molecular biology: the distribution of the length of the longest almost-matching subsequence in aligned amino acid sequences and the distribution of the largest net charge within any $m$ consecutive positions in a charged alphabet string.
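The definitions in the abstract can be illustrated with a minimal Python sketch. It only computes the moving sums $Y_{t-m+1,t}$, the scan statistic $N_{m,T}$ and the waiting time $\tau_{k,m}$ for one observed sequence, and checks that the events $\{\tau_{k,m} > T\}$ and $\{N_{m,T} < k\}$ coincide. It does not reproduce the bounds derived in the paper. The function names and the sample data are hypothetical, chosen for illustration, and the sketch assumes $T \geq m$.

```python
from itertools import accumulate

def moving_sums(xs, m):
    """Return [Y_{t-m+1,t} for t = m, ..., T], where xs = (X_1, ..., X_T)."""
    prefix = [0] + list(accumulate(xs))  # prefix[t] = X_1 + ... + X_t
    return [prefix[t] - prefix[t - m] for t in range(m, len(xs) + 1)]

def scan_statistic(xs, m):
    """N_{m,T}: the largest sum over any m consecutive terms (assumes T >= m)."""
    return max(moving_sums(xs, m))

def waiting_time(xs, m, k):
    """tau_{k,m}: first t >= m whose window sum is at least k.

    Returns None when the threshold is not reached within the T observed
    trials, i.e. when tau_{k,m} > T.
    """
    for t, y in enumerate(moving_sums(xs, m), start=m):
        if y >= k:
            return t
    return None

# Hypothetical 0/1 data, e.g. match indicators at aligned positions.
xs = [0, 1, 0, 1, 1, 0, 0, 1]
m = 3

for k in (2, 3):
    tau = waiting_time(xs, m, k)
    # The event {tau > T} holds exactly when {N < k} holds.
    assert (tau is None) == (scan_statistic(xs, m) < k)
```

Using prefix sums keeps the computation linear in $T$; estimating $P(N_{m,T} < k)$ by simulation would amount to repeating this check over many independently generated sequences.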
Published: 1991-05-14
Classification: Longest matching subsequences, scan statistics, clustering probabilities, 60F10, 60F99
@article{1177005940,
     author = {Glaz, Joseph and Naus, Joseph I.},
     title = {Tight Bounds and Approximations for Scan Statistic Probabilities for Discrete Data},
     journal = {Ann. Appl. Probab.},
     volume = {1},
     number = {4},
     year = {1991},
     pages = {306--318},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1177005940}
}
Glaz, Joseph; Naus, Joseph I. Tight Bounds and Approximations for Scan Statistic Probabilities for Discrete Data. Ann. Appl. Probab., Volume 1 (1991) no. 4, pp. 306-318. http://gdmltest.u-ga.fr/item/1177005940/