Given two sequences over a finite alphabet $\mathcal{L}$, the D2 statistic is the number of $m$-letter word matches between the two sequences. This statistic is used in bioinformatics for expressed sequence tag database searches. Here we study a generalization of the D2 statistic in the context of DNA sequences, under the assumption of strand symmetric Bernoulli text. For k
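As an illustration of the definition above, the following is a minimal sketch of the basic D2 count for exact word matches only. It does not implement the generalization studied in the paper. The function name `d2_statistic` and its parameters are hypothetical, and the sketch assumes that overlapping words are counted and that every pair of matching positions (one in each sequence) contributes one match.

```python
from collections import Counter


def d2_statistic(seq_a: str, seq_b: str, m: int) -> int:
    """Count exact m-letter word matches between seq_a and seq_b.

    Assumption: every pair (i, j) with seq_a[i:i+m] == seq_b[j:j+m]
    counts as one match, and overlapping words are included.
    """
    if m <= 0:
        raise ValueError("word length m must be positive")

    # Tally each m-letter word in both sequences (overlapping windows).
    counts_a = Counter(seq_a[i:i + m] for i in range(len(seq_a) - m + 1))
    counts_b = Counter(seq_b[j:j + m] for j in range(len(seq_b) - m + 1))

    # A word seen c_a times in seq_a and c_b times in seq_b yields
    # c_a * c_b matching position pairs.
    return sum(c_a * counts_b[word] for word, c_a in counts_a.items())
```

Counting words first and then multiplying the tallies avoids comparing every pair of positions directly, so the cost grows with the sequence lengths rather than with their product.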
Published: 2008-02-15
Classification: DNA sequences, sequence comparison, word matches, 60F17, 92D20
@article{1199890013,
author = {Burden, Conrad J. and Kantorovitz, Miriam R. and Wilson, Susan R.},
title = {Approximate word matches between two random sequences},
journal = {Ann. Appl. Probab.},
volume = {18},
number = {1},
year = {2008},
pages = {1--21},
language = {en},
url = {http://dml.mathdoc.fr/item/1199890013}
}
Burden, Conrad J.; Kantorovitz, Miriam R.; Wilson, Susan R. Approximate word matches between two random sequences. Ann. Appl. Probab., Vol. 18 (2008) no. 1, pp. 1-21. http://gdmltest.u-ga.fr/item/1199890013/