Approximate word matches between two random sequences

Burden, Conrad J.; Kantorovitz, Miriam R.; Wilson, Susan R.

Ann. Appl. Probab., Volume 18 (2008) no. 1, pp. 1-21 / Harvested from Project Euclid

Abstract

Given two sequences over a finite alphabet $\mathcal{L}$, the D₂ statistic is the number of $m$-letter word matches between the two sequences. This statistic is used in bioinformatics for expressed sequence tag database searches. Here we study a generalization of the D₂ statistic in the context of DNA sequences, under the assumption of strand-symmetric Bernoulli text. For k
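
As an illustration of the definition above, the exact-match D₂ count can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `d2_statistic` is hypothetical, sequences are treated as linear (no wrap-around at the ends), overlapping word occurrences are counted, and the generalization to approximate matches studied in the paper is not covered.

```python
from collections import Counter


def d2_statistic(seq_a: str, seq_b: str, m: int) -> int:
    """Count m-letter word matches between two sequences.

    Assumptions (not taken from the record above): linear sequences,
    overlapping words allowed, exact matches only.
    """
    if m <= 0:
        raise ValueError("m must be a positive integer")
    # Tally every m-letter word at each starting position of each sequence.
    counts_a = Counter(seq_a[i:i + m] for i in range(len(seq_a) - m + 1))
    counts_b = Counter(seq_b[j:j + m] for j in range(len(seq_b) - m + 1))
    # A word seen x times in A and y times in B contributes x * y matching pairs.
    return sum(count * counts_b[word] for word, count in counts_a.items())


if __name__ == "__main__":
    print(d2_statistic("ACGTAC", "ACGAC", 2))
```

Counting words per sequence and multiplying the tallies gives the same total as comparing every pair of positions directly, but avoids the quadratic loop over position pairs.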

Published: 2008-02-15
Classification: DNA sequences, sequence comparison, word matches, 60F17, 92D20

@article{1199890013,
     author = {Burden, Conrad J. and Kantorovitz, Miriam R. and Wilson, Susan R.},
     title = {Approximate word matches between two random sequences},
     journal = {Ann. Appl. Probab.},
     volume = {18},
     number = {1},
     year = {2008},
     pages = {1-21},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1199890013}
}

Burden, Conrad J.; Kantorovitz, Miriam R.; Wilson, Susan R. Approximate word matches between two random sequences. Ann. Appl. Probab., Volume 18 (2008) no. 1, pp. 1-21. http://gdmltest.u-ga.fr/item/1199890013/