Critical Phenomena for Sequence Matching with Scoring
Dembo, Amir ; Karlin, Samuel ; Zeitouni, Ofer
Ann. Probab., Volume 22 (1994) no. 4, pp. 1993-2021 / Harvested from Project Euclid
Consider two independent sequences $X_1,\ldots, X_n$ and $Y_1,\ldots, Y_n$. Suppose that $X_1,\ldots, X_n$ are i.i.d. $\mu_X$ and $Y_1,\ldots, Y_n$ are i.i.d. $\mu_Y$, where $\mu_X$ and $\mu_Y$ are distributions on finite alphabets $\Sigma_X$ and $\Sigma_Y$, respectively. A score $F: \Sigma_X \times \Sigma_Y \rightarrow \mathbb{R}$ is assigned to each pair $(X_i, Y_j)$ and the maximal nonaligned segment score is $M_n = \max_{0 \leq i, j \leq n - \Delta,\ \Delta \geq 0}\{\sum_{l=1}^{\Delta} F(X_{i+l}, Y_{j+l})\}$. Our result is that $M_n/\log n \rightarrow \gamma^\ast(\mu_X, \mu_Y)$ a.s., with $\gamma^\ast$ determined by a tractable variational formula. Moreover, the pair empirical measure of $(X_{i+l}, Y_{j+l})$ during the segment where $M_n$ is achieved converges to a probability measure $\nu^\ast$, which is accessible by the same formula. These results generalize to $X_i, Y_j$ taking values in any Polish space, to intrasequence scores under shifts, to long quality segments and to more than two sequences.
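For a concrete reading of the definition of $M_n$, the following is a minimal sketch that evaluates the maximum directly for two finite sequences. It is not taken from the paper, which concerns the asymptotics of $M_n/\log n$ rather than computation; the function names `max_segment_score` and `match_score` and the $\pm 1$ scoring are illustrative assumptions. The sketch runs a running-sum recursion along each diagonal $(i+l, j+l)$, resetting to zero whenever the partial sum becomes negative, which corresponds to allowing the empty segment $\Delta = 0$.

```python
def match_score(a, b):
    """Hypothetical score F: +1 for a match, -1 for a mismatch (an assumption)."""
    return 1.0 if a == b else -1.0


def max_segment_score(x, y, score):
    """Evaluate M_n = max over offsets (i, j) and lengths Delta >= 0 of
    sum_{l=1}^{Delta} score(x[i+l], y[j+l]) for finite sequences x and y.

    Uses O(len(x) * len(y)) time and O(len(y)) memory.
    """
    n, m = len(x), len(y)
    best = 0.0  # Delta = 0 gives the empty sum, so the maximum is never negative
    prev = [0.0] * (m + 1)
    for a in range(1, n + 1):
        curr = [0.0] * (m + 1)
        for b in range(1, m + 1):
            # Best score of a segment ending at the pair (x[a-1], y[b-1]),
            # or 0 if every such segment scores below the empty segment.
            curr[b] = max(0.0, prev[b - 1] + score(x[a - 1], y[b - 1]))
            if curr[b] > best:
                best = curr[b]
        prev = curr
    return best
```

For instance, `max_segment_score("ACGT", "TACG", match_score)` picks out the shifted common segment `ACG` (offsets $i = 0$, $j = 1$, $\Delta = 3$) and returns `3.0`.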
Published: 1994-10-14
Classification: Large deviations, strong laws, sequence matching, large segmental sums, 60F10, 60F15
@article{1176988492,
     author = {Dembo, Amir and Karlin, Samuel and Zeitouni, Ofer},
     title = {Critical Phenomena for Sequence Matching with Scoring},
     journal = {Ann. Probab.},
     volume = {22},
     number = {4},
     year = {1994},
     pages = {1993--2021},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176988492}
}
Dembo, Amir; Karlin, Samuel; Zeitouni, Ofer. Critical Phenomena for Sequence Matching with Scoring. Ann. Probab., Vol. 22 (1994) no. 4, pp. 1993-2021. http://gdmltest.u-ga.fr/item/1176988492/