Insertion and deletion are operations that occur commonly in DNA processing and RNA editing. Since biological macromolecules can be viewed as symbols, gene sequences can be represented as strings and structures can be interpreted as languages. This suggests that the bio-molecular structures that occur at different levels can be theoretically studied by formal languages. In the literature, there is no unique grammar formalism that captures various bio-molecular structures. To overcome this deficiency, in this paper, we introduce a simple grammar model called the matrix insertion-deletion system, and using it we model several bio-molecular structures that occur at the intramolecular, intermolecular and RNA secondary levels.
@article{bwmeta1.element.bwnjournal-article-amcv26i1p245bwm, author = {Lakshmanan Kuppusamy and Anand Mahendran}, title = {Modelling DNA and RNA secondary structures using matrix insertion-deletion systems}, journal = {International Journal of Applied Mathematics and Computer Science}, volume = {26}, year = {2016}, pages = {245-258}, zbl = {1336.92062}, language = {en}, url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv26i1p245bwm} }
Lakshmanan Kuppusamy; Anand Mahendran. Modelling DNA and RNA secondary structures using matrix insertion-deletion systems. International Journal of Applied Mathematics and Computer Science, Tome 26 (2016) pp. 245-258. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv26i1p245bwm/
[000] Boullier, P. and Sagot, B. (2011). Multi-component tree insertion grammars, in P. De Groote et al. (Eds.), Formal Grammar 2009, Lecture Notes in Artificial Intelligence, Vol. 5591, Springer, Berlin/Heidelberg, pp. 31-46. | Zbl 1325.68118
[001] Brendel, V. and Busse, H.G. (1984). Genome structure described by formal languages, Nucleic Acids Research 12(5): 2561-2568.
[002] Brown, M. and Wilson, C. (1995). RNA pseudoknot modelling using intersections of stochastic context free grammars with applications to database search, Proceedings of the Pacific Symposium on Biocomputing, Big Island, HI, USA, pp. 109-125.
[003] Cai, L., Russell, L. and Wu, Y. (2003). Stochastic modelling of RNA pseudoknotted structures: A grammatical approach, Bioinformatics 19(1): 66-73.
[004] Calude, C.S. and Pa˘un, Gh. (2001). Computing with Cells and Atoms: An Introduction to Quantum, DNA and Membrane Computing, Taylor and Francis, London.
[005] Chiang, D., Joshi, A.K. and Searls, D.B. (2006). Grammatical representations of macromolecular structure, Journal of Computational Biology 13(5): 1077-1100.
[006] Dong, S. and Searls, D.B. (1994). Gene structure prediction by linguistic methods, Genomics 23(3): 540-551.
[007] Dorigo, M. and Stutzle, T. (2004). Ant Colony Optimization, MIT Press, Cambridge, MA. | Zbl 1092.90066
[008] Durbin, R., Eddy, S., Krogh, A. and Mitchison, G. (1998). Biological Sequence Analysis, Cambridge University Press, Cambridge. | Zbl 0929.92010
[009] Eiben, A.E. and Smith, J.E. (2003). Introduction to Evolutionary Computing, Springer, Berlin/Heidelberg. | Zbl 1028.68022
[010] Galiukschov, B.S. (1981). Semicontextual grammars, Matematicheskaya Logika i Matematicheskaya Lingvistika: 38-50, (in Russian).
[011] Goldberg, E.D. (1989). Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Boston, MA. | Zbl 0721.68056
[012] Haussler, D. (1982). Insertion and Iterated Insertion as Operations on Formal Languages, Ph.D. thesis, University of Colorado, Boulder, CO.
[013] Haussler, D. (1983). Insertion languages, Information Science 131(1): 77-89. | Zbl 0544.68049
[014] Head, T. (1987). Formal language theory and DNA: An analysis of the generative capacity of specific recombinant behaviors, Bulletin of Mathematical Biology 49(6): 737-750. | Zbl 0655.92008
[015] Kuppusamy, L., Mahendran, A. and Krishna, S.N. (2011a). Matrix insertion-deletion systems for bio-molecular structures, in R. Natarajan and A. Ojo (Eds.), ICDCIT2011, Lecture Notes in Computer Science, Vol. 6536, Springer, Berlin/Heidelberg, pp. 301-311.
[016] Kuppusamy, L., Mahendran, A. and Clergerie, E.V. (2011b). Modelling intermolecular structures and defining ambiguity in gene sequences using matrix insertion-deletion systems in biology, computation and linguistics, in G.B. Enguix et al. (Eds.), New Interdisciplinary Paradigms, IOS Press, Amsterdam, pp. 71-85.
[017] Lyngso, R.B., Zuker, M. and Pedersen, C.N.S. (1999). Internal loops in RNA secondary structure prediction, RECOMB99, Proceedings of the 3rd International Conference on Computational Molecular Biology, Lyon, France, pp. 260-267.
[018] Lyngso, R.B. and Pedersen, C.N.S. (2000). Pseudoknots in RNA secondary structure, RECOMB00, Proceedings of the 4th Annual International Conference on Computational Molecular Biology, Tokyo, Japan pp. 201-209.
[019] Mamitsuka, H. and Abe, N. (1994). Prediction of beta-sheet structures using stochastic tree grammars, Proceedings of the 5th Workshop on Genome Informatics, Yokohama, Japan, pp. 19-28.
[020] Pardo, M.A.A., Clergerie, E.V. and Ferro, M.V. (1997). Automata-based parsing in dynamic programming for LIG, in A.S. Narinyani (Ed.), Proceedings of the DIALOGUE'97 Computational Linguistics and Its Applications Workshop, Moscow, Russia, pp. 22-27.
[021] Păun, Gh., Rozenberg, G. and Salomaa, A. (1998). DNA Computing: New Computing Paradigms, Springer, Berlin/Heidelberg. | Zbl 0940.68053
[022] Păun, Gh. (2002). Membrane Computing: An Introduction, Springer, Berlin/Heidelberg. | Zbl 1034.68037
[023] Petre, I. and Verlan, S. (2012). Matrix insertion-deletion systems, Theoretical Computer Science 456: 80-88. | Zbl 1279.68087
[024] Rivas, E. and Eddy, S.R. (2000). The language of RNA: A formal grammar that includes pseudoknots, Bioinformatics 16(4): 334-340.
[025] Rozenberg, G. and Salomaa, A. (1997). Handbook of Formal Languages, Vol. 1, Springer, New York, NY. | Zbl 0866.68057
[026] Sakakibara, Y., Brown, R., Hughey, R., Mian, I.S., Sjolander, K., Underwood, R.C. and Haussler, D. (1996). Stochastic context-free grammars for tRNA modelling, Nucleic Acids Research 22(23): 5112-5120.
[027] Sakakibara, Y. (2003). Pair hidden Markov models on tree structures, Bioinformatics 19(1): 232-240.
[028] Searls, D.B. (1988). Representing genetic information with formal grammars, Proceedings of the National Conference on Artificial Intelligence, Saint Paul, MN, USA, pp. 386-391.
[029] Searls, D.B. (1992). The linguistics of DNA, American Scientist 80(6): 579-591.
[030] Searls, D.B. (1993). The computational linguistics of biological sequences, in L. Hunter (Ed.), Artificial Intelligence and Molecular Biology, AAAI Press, Paolo Alto, CA, pp. 47-120.
[031] Searls, D.B. (1995). Formal grammars for intermolecular structures, 1st International IEEE Symposium on Intelligence and Biological Systems, Washington, DC, USA, pp. 30-37.
[032] Searls, D.B. (2002). The language of genes, Nature 420(6912): 211-217.
[033] Theis, C., Janssen, S. and Giegerich, R. (2010). Prediction of RNA secondary structure including kissing hairpin motifs, Proceedings of WABI 2010, Liverpool, UK, pp. 52-64.
[034] Uemura, Y, Hasegawa, A., Kobayashi, S. and Yokomori, T. (1999). Tree adjoining grammars for RNA structure prediction, Theoretical Computer Science 210(2): 277-303. | Zbl 0912.68121
[035] Yuki, S. and Kasami, T. (2006). RNA pseudoknotted structure prediction using stochastic multiple context-free grammar, IPSJ Transactions on Bioinformatics 47: 12-21.