Splicing language

From HandWiki

In mathematics and theoretical computer science, a splicing language is a formal language which formalizes the action of gene splicing in molecular biology. Splicing languages have a variety of definitions based on the form of splicing rules allowed, which describe how one may "cut" and "paste together" strings from the language to obtain new strings. In all of them, given an initial language I over a finite alphabet Σ and a set of splicing rules R, a splicing language is the smallest language containing I which is closed under applying any splicing rule rR.

The original definition of a splicing language was given by Head in 1987.[1] Later on, alternative and inequivalent definitions were provided by Păun[2] and Pixton.[3] The class of languages generated by Head splicing is strictly contained in that of those generated by Păun splicing, which is in turn strictly contained in that of those generated by Pixton splicing.[4]

Definition

The following definition is that of a Păun splicing system,[5] which is most common:

Let Σ be a finite alphabet and IΣ* a language. A splicing rule is a quadruple r=(u1,v1,u2,v2)(Σ*)4 (often written r=(u1,v1;u2,v2)). For w1,w2Σ* and a splicing rule r=(u1,v1;u2,v2)(Σ*)4, we write (w1,w2)rz if w1=x1u1v1y1, w2=x2u2v2y2, and z=x1u1v2y2. If R is a set of splicing rules over Σ, we say that σ=(Σ,R) is an H-scheme and define the action of σ on I to be σ(I)={zΣ*:w1,w2I s.t. (w1,w2)rz rR}. Now, inductively, let σ0(I)=I and σi+1(I)=σ(σi(I)). σ*(I)=i0+σi(I) is the splicing language generated by the H-system H=(Σ,I,R). That is, the smallest language containing I and closed under applications of any rR.

A rule set R is reflexive if (u1,v1;u2,v2)R implies that (u1,v1;u1,v1),(u2,v2;u2,v2)R. A rule set R is symmetric if (u1,v1;u2,v2)R implies that (u2,v2;u1,v1)R. A splicing language is called reflexive (resp. symmetric) if it is generated by a reflexive (resp. symmetric) H-system.

Results and Examples

A non-example of a splicing language is (aa)*, whereas b(aa)* is a splicing language. In fact, if L is a regular language on the alphabet Σ, and b is a letter not in Σ, then the language bL={bw:wL} is a splicing language.[6]

All splicing languages generated by a finite initial language and finite rule set are regular.[5]

It is decidable whether or not a regular language is a splicing language[7] and whether or not it is reflexive.[8] Both algorithms make use of the decidability of whether or not a splicing rule respects a regular language, meaning that the language is closed under splicing by that rule.

Every regular splicing language contains a constant, which is a word cΣ* such that u1cv1,u2cv2L implies that u1cv2,u2cv1L for any u1,v1,u2,v2Σ*.[9]

b(aa)*(aa)*b(aa)* is a reflexive splicing language which is not symmetric. It is also generated by a finite splicing system.[10]

a*ba*ba*a*ba*a* is a splicing language generated by a finite splicing system which is neither reflexive nor symmetric.[10]

References

  1. Head, T (1987). "Formal language theory and DNA: An analysis of the generative capacity of specific recombinant behaviors" (in en). Bulletin of Mathematical Biology 49 (6): 737–759. doi:10.1016/S0092-8240(87)90018-8. http://link.springer.com/10.1016/S0092-8240(87)90018-8. 
  2. Păun, Gheorghe; Rozenberg, Grzegorz; Salomaa, Arto (1996-11-20). "Computing by splicing". Theoretical Computer Science 168 (2): 321–336. doi:10.1016/S0304-3975(96)00082-5. ISSN 0304-3975. https://www.sciencedirect.com/science/article/pii/S0304397596000825. 
  3. Pixton, Dennis (1996-08-13). "Regularity of splicing languages". Discrete Applied Mathematics 69 (1): 101–124. doi:10.1016/0166-218X(95)00079-7. ISSN 0166-218X. https://dx.doi.org/10.1016/0166-218X%2895%2900079-7. 
  4. Bonizzoni, P.; Ferretti, C.; Mauri, G.; Zizza, R. (2001-09-30). "Separating some splicing models". Information Processing Letters 79 (6): 255–259. doi:10.1016/S0020-0190(01)00139-9. ISSN 0020-0190. https://www.sciencedirect.com/science/article/pii/S0020019001001399. 
  5. 5.0 5.1 Păun, Gheorghe; Rozenberg, Grzegorz; Salomaa, Arto (1998) (in en). DNA Computing. doi:10.1007/978-3-662-03563-4. ISBN 978-3-642-08388-4. https://link.springer.com/book/10.1007/978-3-662-03563-4. 
  6. Anderson, James A. (2006). Automata Theory with Modern Applications. Cambridge: Cambridge University Press. doi:10.1017/cbo9780511607202. ISBN 978-0-521-84887-9. https://www.cambridge.org/core/books/automata-theory-with-modern-applications/C209E9807ED2EBB5F072A9861D33CDF8. 
  7. Kari, Lila; Kopecki, Steffen (2017-03-01). "Deciding whether a regular language is generated by a splicing system". Journal of Computer and System Sciences 84: 263–287. doi:10.1016/j.jcss.2016.10.001. ISSN 0022-0000. https://www.sciencedirect.com/science/article/pii/S0022000016300940. 
  8. Head, Tom; Pixton, Dennis; Goode, Elizabeth (2003). "Splicing Systems: Regularity and Below". in Hagiya, Masami; Ohuchi, Azuma (in en). DNA Computing. Lecture Notes in Computer Science. 2568. Berlin, Heidelberg: Springer. pp. 262–268. doi:10.1007/3-540-36440-4_23. ISBN 978-3-540-36440-5. https://link.springer.com/chapter/10.1007/3-540-36440-4_23. 
  9. Bonizzoni, Paola; Jonoska, Nataša (2015-06-01). "Existence of constants in regular splicing languages". Information and Computation 242: 340–353. doi:10.1016/j.ic.2015.04.001. ISSN 0890-5401. PMID 27185985. 
  10. 10.0 10.1 Goode, Elizabeth; Pixton, Dennis (2007-04-15). "Recognizing splicing languages: Syntactic monoids and simultaneous pumping". Discrete Applied Mathematics 155 (8): 989–1006. doi:10.1016/j.dam.2006.10.006. ISSN 0166-218X. https://www.sciencedirect.com/science/article/pii/S0166218X06004562.