Unavoidable pattern

From HandWiki

In mathematics and theoretical computer science, a pattern is an unavoidable pattern if it is unavoidable on any finite alphabet.

Definitions

Pattern

Like a word, a pattern (also called term) is a sequence of symbols over some alphabet.

The minimum multiplicity of the pattern p is m(p)=min(countp(x):xp) where countp(x) is the number of occurrence of symbol x in pattern p. In other words, it is the number of occurrences in p of the least frequently occurring symbol in p.

Instance

Given finite alphabets Σ and Δ, a word xΣ* is an instance of the pattern pΔ* if there exists a non-erasing semigroup morphism f:Δ*Σ* such that f(p)=x, where Σ* denotes the Kleene star of Σ. Non-erasing means that f(a)ε for all aΔ, where ε denotes the empty string.

Avoidance / Matching

A word w is said to match, or encounter, a pattern p if a factor (also called subword or substring) of w is an instance of p. Otherwise, w is said to avoid p, or to be p-free. This definition can be generalized to the case of an infinite w, based on a generalized definition of "substring".

Avoidability / Unavoidability on a specific alphabet

A pattern p is unavoidable on a finite alphabet Σ if each sufficiently long word xΣ* must match p; formally: if nN. xΣ*. (|x|nx matches p). Otherwise, p is avoidable on Σ, which implies there exist infinitely many words over the alphabet Σ that avoid p.

By Kőnig's lemma, pattern p is avoidable on Σ if and only if there exists an infinite word wΣω that avoids p.[1]

Maximal p-free word

Given a pattern p and an alphabet Σ. A p-free word wΣ* is a maximal p-free word over Σ if aw and wa match p aΣ.

Avoidable / Unavoidable pattern

A pattern p is an unavoidable pattern (also called blocking term) if p is unavoidable on any finite alphabet.

If a pattern is unavoidable and not limited to a specific alphabet, then it is unavoidable for any finite alphabet by default. Conversely, if a pattern is said to be avoidable and not limited to a specific alphabet, then it is avoidable on some finite alphabet by default.

k-avoidable / k-unavoidable

A pattern p is k-avoidable if p is avoidable on an alphabet Σ of size k. Otherwise, p is k-unavoidable, which means p is unavoidable on every alphabet of size k.[2]

If pattern p is k-avoidable, then p is g-avoidable for all gk.

Given a finite set of avoidable patterns S={p1,p2,...,pi}, there exists an infinite word wΣω such that w avoids all patterns of S.[1] Let μ(S) denote the size of the minimal alphabet Σsuch that wΣω avoiding all patterns of S.

Avoidability index

The avoidability index of a pattern p is the smallest k such that p is k-avoidable, and if p is unavoidable.[1]

Properties

  • A pattern q is avoidable if q is an instance of an avoidable pattern p.[3]
  • Let avoidable pattern p be a factor of pattern q, then q is also avoidable.[3]
  • A pattern q is unavoidable if and only if q is a factor of some unavoidable pattern p.
  • Given an unavoidable pattern p and a symbol a not in p, then pap is unavoidable.[3]
  • Given an unavoidable pattern p, then the reversal pR is unavoidable.
  • Given an unavoidable pattern p, there exists a symbol a such that a occurs exactly once in p.[3]
  • Let nN represent the number of distinct symbols of pattern p. If |p|2n, then p is avoidable.[3]

Zimin words

Given alphabet Δ={x1,x2,...}, Zimin words (patterns) are defined recursively Zn+1=Znxn+1Zn for nZ+ and Z1=x1.

Unavoidability

All Zimin words are unavoidable.[4]

A word w is unavoidable if and only if it is a factor of a Zimin word.[4]

Given a finite alphabet Σ, let f(n,|Σ|) represent the smallest mZ+ such that w matches Zn for all wΣm. We have following properties:[5]

  • f(1,q)=1
  • f(2,q)=2q+1
  • f(3,2)=29
  • f(n,q)n1q=qqqn1

Zn is the longest unavoidable pattern constructed by alphabet Δ={x1,x2,...,xn} since |Zn|=2n1.

Pattern reduction

Free letter

Given a pattern p over some alphabet Δ, we say xΔ is free for p if there exist subsets A,B of Δ such that the following hold:

  1. uv is a factor of p and uAuv is a factor of p and vB
  2. xABBA

For example, let p=abcbab, then b is free for p since there exist A=ac,B=b satisfying the conditions above.

Reduce

A pattern pΔ* reduces to pattern q if there exists a symbol xΔ such that x is free for p, and q can be obtained by removing all occurrence of x from p. Denote this relation by pxq.

For example, let p=abcbab, then p can reduce to q=aca since b is free for p.

Locked

A word w is said to be locked if w has no free letter; hence w can not be reduced.[6]

Transitivity

Given patterns p,q,r, if p reduces to q and q reduces to r, then p reduces to r. Denote this relation by p*r.

Unavoidability

A pattern p is unavoidable if and only if p reduces to a word of length one; hence w such that |w|=1 and p*w.[7][4]

Graph pattern avoidance[8]

Avoidance / Matching on a specific graph

Given a simple graph G=(V,E), a edge coloring c:EΔ matches pattern p if there exists a simple path P=[e1,e2,...,er] in G such that the sequence c(P)=[c(e1),c(e2),...,c(er)] matches p. Otherwise, c is said to avoid p or be p-free.

Similarly, a vertex coloring c:VΔ matches pattern p if there exists a simple path P=[c1,c2,...,cr] in G such that the sequence c(P) matches p.

Pattern chromatic number

The pattern chromatic number πp(G) is the minimal number of distinct colors needed for a p-free vertex coloring c over the graph G.

Let πp(n)=max{πp(G):GGn} where Gn is the set of all simple graphs with a maximum degree no more than n.

Similarly, πp(G) and πp(n) are defined for edge colorings.

Avoidability / Unavoidability on graphs

A pattern p is avoidable on graphs if πp(n) is bounded by cp, where cp only depends on p.

  • Avoidance on words can be expressed as a specific case of avoidance on graphs; hence a pattern p is avoidable on any finite alphabet if and only if πp(Pn)cp for all nZ+, where Pn is a graph of n vertices concatenated.

Probabilistic bound on πp(n)

There exists an absolute constant c, such that πp(n)cnm(p)m(p)1cn2 for all patterns p with m(p)2.[8]

Given a pattern p, let n represent the number of distinct symbols of p. If |p|2n, then p is avoidable on graphs.

Explicit colorings

Given a pattern p such that countp(x) is even for all xp, then πp(K2k)2k1 for all k1, where Kn is the complete graph of n vertices.[8]

Given a pattern p such that m(p)2, and an arbitrary tree T, let S be the set of all avoidable subpatterns and their reflections of p. Then πp(T)3μ(S).[8]

Given a pattern p such that m(p)2, and a tree T with degree n2. Let S be the set of all avoidable subpatterns and their reflections of p, then πp(T)2(n1)μ(S).[8]

Examples

  • The Thue–Morse sequence is cube-free and overlap-free; hence it avoids the patterns xxx and xyxyx.[2]
  • A square-free word is one avoiding the pattern xx. The word over the alphabet {0,±1} obtained by taking the first difference of the Thue–Morse sequence is an example of an infinite square-free word.[9][10]
  • The patterns x and xyx are unavoidable on any alphabet, since they are factors of the Zimin words.[11][1]
  • The power patterns xn for n3 are 2-avoidable.[1]
  • All binary patterns can be divided into three categories:[1]
    • ε,x,xyx are unavoidable.
    • xx,xxy,xyy,xxyx,xxyy,xyxx,xyxy,xyyx,xxyxx,xxyxy,xyxyy have avoidability index of 3.
    • others have avoidability index of 2.
  • abwbaxbcyaczca has avoidability index of 4, as well as other locked words.[6]
  • abvacwbaxbcycdazdcd has avoidability index of 5.[12]
  • The repetitive threshold RT(n) is the infimum of exponents k such that xk is avoidable on an alphabet of size n. Also see Dejean's theorem.

Open problems

  • Is there an avoidable pattern p such that the avoidability index of p is 6?
  • Given an arbitrarily pattern p, is there an algorithm to determine the avoidability index of p?[1]

References

  1. 1.0 1.1 1.2 1.3 1.4 1.5 1.6 Lothaire, M. (2002). Algebraic Combinatorics on Words. Cambridge University Press. ISBN 9780521812207. https://archive.org/details/algebraiccombina0000loth. 
  2. 2.0 2.1 (in en) Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc.. pp. 127. ISBN 978-0-8218-7325-0. https://books.google.com/books?id=3w9eR3u8GN4C&q=Combinatorics+on+words.+Christoffel+words+and+repetitions+in+words+2009. 
  3. 3.0 3.1 3.2 3.3 3.4 Schmidt, Ursula (1987-08-01). "Long unavoidable patterns" (in en). Acta Informatica 24 (4): 433–445. doi:10.1007/BF00292112. ISSN 1432-0525. 
  4. 4.0 4.1 4.2 Zimin, A. I. (1984). "Blocking Sets of Terms" (in en). Mathematics of the USSR-Sbornik 47 (2): 353–364. doi:10.1070/SM1984v047n02ABEH002647. ISSN 0025-5734. Bibcode1984SbMat..47..353Z. 
  5. Joshua, Cooper; Rorabaugh, Danny (2013). Bounds on Zimin Word Avoidance. Bibcode2014arXiv1409.3080C. https://archive.org/details/arxiv-1409.3080. 
  6. 6.0 6.1 Baker, Kirby A.; McNulty, George F.; Taylor, Walter (1989-12-18). "Growth problems for avoidable words" (in en). Theoretical Computer Science 69 (3): 319–345. doi:10.1016/0304-3975(89)90071-6. ISSN 0304-3975. 
  7. Bean, Dwight R.; Ehrenfeucht, Andrzej; McNulty, George F. (1979). "Avoidable patterns in strings of symbols." (in en). Pacific Journal of Mathematics 85 (2): 261–294. doi:10.2140/pjm.1979.85.261. ISSN 0030-8730. https://projecteuclid.org/euclid.pjm/1102783913. 
  8. 8.0 8.1 8.2 8.3 8.4 Grytczuk, Jarosław (2007-05-28). "Pattern avoidance on graphs" (in en). Discrete Mathematics. The Fourth Caracow Conference on Graph Theory 307 (11): 1341–1346. doi:10.1016/j.disc.2005.11.071. ISSN 0012-365X. 
  9. (in en) Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc.. pp. 97. ISBN 978-0-8218-7325-0. https://books.google.com/books?id=3w9eR3u8GN4C&q=Combinatorics+on+words.+Christoffel+words+and+repetitions+in+words+2009. 
  10. Fogg, N. Pytheas (2002-09-23) (in en). Substitutions in Dynamics, Arithmetics and Combinatorics. Springer Science & Business Media. pp. 104. ISBN 978-3-540-44141-0. https://books.google.com/books?id=qBIsuwEACAAJ&q=Substitutions+in+dynamics,+arithmetics+and+combinatorics.. 
  11. Allouche, Jean-Paul; Shallit, Jeffrey; Shallit, Professor Jeffrey (2003-07-21) (in en). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. pp. 24. ISBN 978-0-521-82332-6. https://books.google.com/books?id=2ZsSUStt96sC&dq=Automatic+Sequences%3A+Theory%2C+Applications%2C+Generalizations&pg=PR13. 
  12. Clark, Ronald J. (2006-04-01). "The existence of a pattern which is 5-avoidable but 4-unavoidable". International Journal of Algebra and Computation 16 (2): 351–367. doi:10.1142/S0218196706002950. ISSN 0218-1967. 
  • Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. ISBN 978-0-521-82332-6. 
  • Berstel, Jean; Lauve, Aaron; Reutenauer, Christophe; Saliola, Franco V. (2009). Combinatorics on words. Christoffel words and repetitions in words. CRM Monograph Series. 27. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4480-9. 
  • Lothaire, M. (2011). Algebraic combinatorics on words. Encyclopedia of Mathematics and Its Applications. 90. With preface by Jean Berstel and Dominique Perrin (Reprint of the 2002 hardback ed.). Cambridge University Press. ISBN 978-0-521-18071-9. 
  • Pytheas Fogg, N. (2002). Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian et al.. eds. Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. 1794. Berlin: Springer-Verlag. ISBN 3-540-44141-7.