Square-free word

In combinatorics, a squarefree word is a word (a sequence of symbols) that does not contain any squares. A square is a word of the form $XX$ , where $X$ is not empty. Thus, a squarefree word can also be defined as a word that avoids the pattern $XX$ .

Finite squarefree words

Binary alphabet

Over a binary alphabet [math]\displaystyle{ \{0,1\} }[/math], the only squarefree words are the empty word [math]\displaystyle{ \epsilon,0,1,01,10,010 }[/math], and [math]\displaystyle{ 101 }[/math].

Ternary alphabet

Over a ternary alphabet [math]\displaystyle{ \{0,1,2\} }[/math], there are infinitely many squarefree words. It is possible to count the number [math]\displaystyle{ c(n) }[/math] of ternary squarefree words of length $n$ .

The number of ternary squarefree words of length n^[1]
$n$	0	1	2	3	4	5	6	7	8	9	10	11	12
[math]\displaystyle{ c(n) }[/math]	1	3	6	12	18	30	42	60	78	108	144	204	264

This number is bounded by [math]\displaystyle{ c(n) = \Theta(\alpha^n) }[/math], where [math]\displaystyle{ 1.3017597 \lt \alpha \lt 1.3017619 }[/math].^[2] The upper bound on [math]\displaystyle{ \alpha }[/math] can be found via Fekete's Lemma and approximation by automata. The lower bound can be found by finding a substitution that preserves squarefreeness.^[2]

Alphabet with more than three letters

Since there are infinitely many squarefree words over three-letter alphabets, this implies there are also infinitely many squarefree words over an alphabet with more than three letters.

The following table shows the exact growth rate of the $k$ -ary squarefree words:

The growth rate of the $k$ -ary squarefree words^[2]
alphabet size ( $k$ )	4	5	6	7	8	9
growth rate	2.6215080	3.7325386	4.7914069	5.8284661	6.8541173	7.8729902
alphabet size ( $k$ )	10	11	12	13	14	15
growth rate	8.8874856	9.8989813	10.9083279	11.9160804	12.9226167	13.9282035

2-dimensional words

Consider a map [math]\displaystyle{ \textbf{w} }[/math] from [math]\displaystyle{ \mathbb{N}^2 }[/math] to $A$ , where $A$ is an alphabet and [math]\displaystyle{ \textbf{w} }[/math] is called a 2-dimensional word. Let [math]\displaystyle{ w_{m,n} }[/math] be the entry [math]\displaystyle{ \textbf{w}(m,n) }[/math]. A word [math]\displaystyle{ \textbf{x} }[/math] is a line of [math]\displaystyle{ \textbf{w} }[/math] if there exists [math]\displaystyle{ i_1,i_2,j_1, j_2 }[/math]such that [math]\displaystyle{ \text{gcd}(j_1, j_2) = 1 }[/math], and for [math]\displaystyle{ t \ge 0, x_t = w_{{i_1}+{j_1t},{i_2}+{j_2t}} }[/math].^[3]

Carpi^[4] proves that there exists a 2-dimensional word [math]\displaystyle{ \textbf{w} }[/math] over a 16-letter alphabet such that every line of [math]\displaystyle{ \textbf{w} }[/math] is squarefree. A computer search shows that there are no 2-dimensional words [math]\displaystyle{ \textbf{w} }[/math]over a 7-letter alphabet, such that every line of [math]\displaystyle{ \textbf{w} }[/math] is squarefree.

Generating finite squarefree words

Shur^[5] proposes an algorithm called R2F (random-t(w)o-free) that can generate a squarefree word of length $n$ over any alphabet with three or more letters. This algorithm is based on a modification of entropy compression: it randomly selects letters from a k-letter alphabet to generate a [math]\displaystyle{ (k+1) }[/math]-ary squarefree word.

algorithm R2F is
    input: alphabet size [math]\displaystyle{ k \ge 2 }[/math],
           word length [math]\displaystyle{ n \gt  1 }[/math]
    output: a [math]\displaystyle{ (k+1) }[/math]-ary squarefree word  $w$ of length  $n$ .

    (Note that [math]\displaystyle{ \color{gray}\Sigma_{k+1} }[/math] is the alphabet with letters [math]\displaystyle{ \color{gray}\{1,...,k+1\} }[/math].)
    (For a word [math]\displaystyle{ \color{gray}w \in \Sigma_k }[/math], [math]\displaystyle{ \color{gray}\chi_w }[/math] is the permutation of [math]\displaystyle{ \color{gray}\Sigma_k }[/math] such that  $a$  precedes  $b$  in [math]\displaystyle{ \color{gray}\chi_w }[/math] if the 
     right most position of  $a$  in  $w$  is to the right of the rightmost position of  $b$  in  $w$ .
     For example,  [math]\displaystyle{ \color{gray}w=136263163\in \Sigma_6 }[/math] has [math]\displaystyle{ \color{gray}\chi_w=361245 }[/math].)

    choose [math]\displaystyle{ w[1] }[/math] in [math]\displaystyle{ \Sigma_{k+1} }[/math] uniformly at random 
    set [math]\displaystyle{ \chi_w }[/math] to [math]\displaystyle{ w[1] }[/math] followed by all other letters of [math]\displaystyle{ \Sigma_{k+1} }[/math] in increasing order
    set the number  $N$  of iterations to 0

    while [math]\displaystyle{ |w| \lt  n }[/math] do
        choose  $j$  in [math]\displaystyle{ \Sigma_{k} }[/math] uniformly at random
        append [math]\displaystyle{ a = \chi_w[j+1] }[/math] to the end of  $w$ 
        update [math]\displaystyle{ \chi_w }[/math] shifting the first  $j$  elements to the right and setting [math]\displaystyle{ \chi_w[1] = a }[/math]
        increment  $N$  by  $1$ 
        if  $w$  ends with a square of rank  $r$  then
            delete the last  $r$  letters of  $w$ 

    return  $w$

Every (k+1)-ary squarefree word can be the output of Algorithm R2F, because on each iteration it can append any letter except for the last letter of $w$ .

The expected number of random k-ary letters used by Algorithm R2F to construct a [math]\displaystyle{ (k+1) }[/math]-ary squarefree word of length $n$ is[math]\displaystyle{ N=n\left(1 + \frac 2 {k^2} + \frac 1 {k^3} + \frac 4 {k^4} + O\left(\frac 1 {k^5}\right)\right)+O(1). }[/math]Note that there exists an algorithm that can verify the squarefreeness of a word of length $n$ in [math]\displaystyle{ O(n \log n) }[/math] time. Apostolico and Preparata^[6] give an algorithm using suffix trees. Crochemore^[7] uses partitioning in his algorithm. Main and Lorentz^[8] provide an algorithm based on the divide-and-conquer method. A naive implementation may require [math]\displaystyle{ O(n^2) }[/math] time to verify the squarefreeness of a word of length $n$ .

Infinite squarefree words

There exist arbitrarily long squarefree words in any alphabet with three or more letters, as proved by Axel Thue.^[9]

Examples

First difference of the Thue–Morse sequence

One example of an infinite squarefree word over an alphabet of size 3 is the word over the alphabet [math]\displaystyle{ \{-1,0,+1\} }[/math] obtained by taking the first difference of the Thue–Morse sequence.^[9] That is, from the Thue–Morse sequence

[math]\displaystyle{ 0, 1 ,1, 0 ,1 ,0 ,0 ,1, 1 ,0 ,0 ,1, 0 ,1 ,1, 0 ... }[/math]

one forms a new sequence in which each term is the difference of two consecutive terms of the Thue–Morse sequence. The resulting squarefree word is

[math]\displaystyle{ 1,0,-1,1,-1,0,1,0,-1,0,1,-1,1,0,-1,... }[/math](sequence A029883 in the OEIS).

Leech's morphism

Another example found by John Leech^[10] is defined recursively over the alphabet [math]\displaystyle{ \{0,1,2\} }[/math]. Let [math]\displaystyle{ w_1 }[/math] be any squarefree word starting with the letter 0. Define the words [math]\displaystyle{ \{w_i \mid i \in \mathbb{N} \} }[/math] recursively as follows: the word [math]\displaystyle{ w_{i+1} }[/math] is obtained from [math]\displaystyle{ w_i }[/math] by replacing each 0 in [math]\displaystyle{ w_i }[/math] with 0121021201210, each 1 with 1202102012021, and each 2 with 2010210120102. It is possible to prove that the sequence converges to the infinite squarefree word

0121021201210120210201202120102101201021202102012021...

Generating infinite squarefree words

Infinite squarefree words can be generated by squarefree morphism. A morphism is called squarefree if the image of every squarefree word is squarefree. A morphism is called k–squarefree if the image of every squarefree word of length k is squarefree.

Crochemore^[11] proves that a uniform morphism $h$ is squarefree if and only if it is 3-squarefree. In other words, $h$ is squarefree if and only if [math]\displaystyle{ h(w) }[/math] is squarefree for all squarefree $w$ of length 3. It is possible to find a squarefree morphism by brute-force search.

algorithm squarefree_morphism is
    output: a squarefree morphism with the lowest possible rank  $k$ .

    set [math]\displaystyle{ k = 3 }[/math]
    while True do
        set k_sf_words to the list of all squarefree words of length  $k$  over a ternary alphabet
        for each [math]\displaystyle{ h(0) }[/math] in k_sf_words do
            for each [math]\displaystyle{ h(1) }[/math] in k_sf_words do
                for each [math]\displaystyle{ h(2) }[/math] in k_sf_words do
                    if [math]\displaystyle{ h(1) = h(2) }[/math] then
                        break from the current loop (advance to next [math]\displaystyle{ h(1) }[/math])
                    if [math]\displaystyle{ h(0) \ne h(1) }[/math] and [math]\displaystyle{ h(2) \ne h(0) }[/math] then
                        if [math]\displaystyle{ h(w) }[/math] is squarefree for all squarefree  $w$  of length 3 then
                            return [math]\displaystyle{ h(0), h(1), h(2) }[/math]
        increment  $k$  by 1

Over a ternary alphabet, there are exactly 144 uniform squarefree morphisms of rank 11 and no uniform squarefree morphisms with a lower rank than 11.

To obtain an infinite squarefree words, start with any squarefree word such as 0, and successively apply a squarefree morphism $h$ to it. The resulting words preserve the property of squarefreeness. For example, let $h$ be a squarefree morphism, then as [math]\displaystyle{ w \to \infty }[/math], [math]\displaystyle{ h^{w}(0) }[/math] is an infinite squarefree word.

Note that, if a morphism over a ternary alphabet is not uniform, then this morphism is squarefree if and only if it is 5-squarefree.^[11]

Letter combinations in squarefree words

Extending a squarefree word to avoid

ab

.

Avoid two-letter combinations

Over a ternary alphabet, a squarefree word of length more than 13 contains all the squarefree two-letter combinations.^[12]

This can be proved by constructing a squarefree word without the two-letter combination $ab$ . As a result, $bcba$ $cbca$ $cbaca$ is the longest squarefree word without the combination $ab$ and its length is equal to 13.

Note that over a more than three-letter alphabet there are squarefree words of any length without an arbitrary two-letter combination.

Avoid three-letter combinations

Over a ternary alphabet, a squarefree word of length more than 36 contains all the squarefree three-letter combinations.^[12]

However, there are squarefree words of any length without the three-letter combination $aba$ .

Note that over a more than three-letter alphabet there are squarefree words of any length without an arbitrary three-letter combination.

Density of a letter

The density of a letter $a$ in a finite word $w$ is defined as [math]\displaystyle{ \frac{|w|_a}{|w|} }[/math] where [math]\displaystyle{ |w|_a }[/math] is the number of occurrences of $a$ in [math]\displaystyle{ w }[/math] and [math]\displaystyle{ |w| }[/math] is the length of the word. The density of a letter $a$ in an infinite word is [math]\displaystyle{ \liminf_{l\to \infty}\frac{|w_l|_a}{|w_l|} }[/math] where [math]\displaystyle{ w_l }[/math] is the prefix of the word $w$ of length $l$ .^[13]

The minimal density of a letter $a$ in an infinite ternary squarefree word is equal to [math]\displaystyle{ \frac{883}{3215} }[/math].^[13]

The maximum density of a letter $a$ in an infinite ternary squarefree word is equal to [math]\displaystyle{ \frac{255}{653} }[/math].^[14]

Notes

↑ "A006156 - OEIS". https://oeis.org/A006156.
↑ ^2.0 ^2.1 ^2.2 Shur, Arseny (2011). "Growth properties of power-free languages". Computer Science Review 6 (5–6): 28–43. doi:10.1016/j.cosrev.2012.09.001.
↑ Berthe, Valerie; Rigo, Michel, eds. (2016), "Preface", Combinatorics, Words and Symbolic Dynamics, Cambridge University Press, pp. xi–xviii, doi:10.1017/cbo9781139924733.001, ISBN 9781139924733
↑ Carpi, Arturo (1988). "Multidimensional unrepetitive configurations". Theoretical Computer Science 56 (2): 233–241. doi:10.1016/0304-3975(88)90080-1. ISSN 0304-3975.
↑ Shur, Arseny (2015). "Generating square-free words efficiently". Theoretical Computer Science 601: 67–72. doi:10.1016/j.tcs.2015.07.027.
↑ Apostolico, A.; Preparata, F.P. (Feb 1983). "Optimal off-line detection of repetitions in a string". Theoretical Computer Science 22 (3): 297–315. doi:10.1016/0304-3975(83)90109-3. ISSN 0304-3975.
↑ Crochemore, Max (Oct 1981). "An optimal algorithm for computing the repetitions in a word". Information Processing Letters 12 (5): 244–250. doi:10.1016/0020-0190(81)90024-7. ISSN 0020-0190.
↑ Main, Michael G; Lorentz, Richard J (Sep 1984). "An O(n log n) algorithm for finding all repetitions in a string". Journal of Algorithms 5 (3): 422–432. doi:10.1016/0196-6774(84)90021-x. ISSN 0196-6774.
↑ ^9.0 ^9.1 Berstel, Jean (1994). Axel Thue's papers on repetitions in words a translation. Départements de mathématiques et d'informatique, Université du Québec à Montréal. ISBN 978-2892761405. OCLC 494791187.
↑ Leech, J. (1957). "A problem on strings of beads". Math. Gaz. 41: 277–278. doi:10.1017/S0025557200236115.
↑ ^11.0 ^11.1 Berstel, Jean (April 1984). "Some Recent Results on Squarefree Words". Annual Symposium on Theoretical Aspects of Computer Science. Lecture Notes in Computer Science 166: 14–25. doi:10.1007/3-540-12920-0_2. ISBN 978-3-540-12920-2. http://www.numdam.org/item/PDML_1985___2B_21_0/.
↑ ^12.0 ^12.1 Zolotov, Boris (2015). "Another Solution to the Thue Problem of Non-Repeating Words". arXiv:1505.00019 [math.CO].
↑ ^13.0 ^13.1 Khalyavin, Andrey (2007). "The minimal density of a letter in an infinite ternary square-free word is 883/3215". Journal of Integer Sequences 10 (2): 3. Bibcode: 2007JIntS..10...65K. http://emis.de/journals/JIS/VOL10/Khalyavin/khalyavin13.pdf.
↑ Ochem, Pascal (2007). "Letter frequency in infinite repetition-free words". Theoretical Computer Science 380 (3): 388–392. doi:10.1016/j.tcs.2007.03.027. ISSN 0304-3975.

References

Berstel, Jean; Lauve, Aaron; Reutenauer, Christophe; Saliola, Franco V. (2009). Combinatorics on words. Christoffel words and repetitions in words. CRM Monograph Series. 27. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4480-9. http://www.ams.org/bookpages/crmm-27.
Lothaire, M. (1997). Combinatorics on words. Cambridge: Cambridge University Press. ISBN 978-0-521-59924-5. .
Lothaire, M. (2011). Algebraic combinatorics on words. Encyclopedia of Mathematics and Its Applications. 90. With preface by Jean Berstel and Dominique Perrin (Reprint of the 2002 hardback ed.). Cambridge University Press. ISBN 978-0-521-18071-9.
Pytheas Fogg, N. (2002). Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. 1794. Berlin: Springer-Verlag. ISBN 978-3-540-44141-0.

0.00

(0 votes)

Original source: https://en.wikipedia.org/wiki/Square-free word. Read more

[1] "A006156 - OEIS". https://oeis.org/A006156.

[:1-2] 2.0 ^2.1 ^2.2 Shur, Arseny (2011). "Growth properties of power-free languages". Computer Science Review 6 (5–6): 28–43. doi:10.1016/j.cosrev.2012.09.001.

[3] Berthe, Valerie; Rigo, Michel, eds. (2016), "Preface", Combinatorics, Words and Symbolic Dynamics, Cambridge University Press, pp. xi–xviii, doi:10.1017/cbo9781139924733.001, ISBN 9781139924733

[4] Carpi, Arturo (1988). "Multidimensional unrepetitive configurations". Theoretical Computer Science 56 (2): 233–241. doi:10.1016/0304-3975(88)90080-1. ISSN 0304-3975.

[5] Shur, Arseny (2015). "Generating square-free words efficiently". Theoretical Computer Science 601: 67–72. doi:10.1016/j.tcs.2015.07.027.

[6] Apostolico, A.; Preparata, F.P. (Feb 1983). "Optimal off-line detection of repetitions in a string". Theoretical Computer Science 22 (3): 297–315. doi:10.1016/0304-3975(83)90109-3. ISSN 0304-3975.

[7] Crochemore, Max (Oct 1981). "An optimal algorithm for computing the repetitions in a word". Information Processing Letters 12 (5): 244–250. doi:10.1016/0020-0190(81)90024-7. ISSN 0020-0190.

[8] Main, Michael G; Lorentz, Richard J (Sep 1984). "An O(n log n) algorithm for finding all repetitions in a string". Journal of Algorithms 5 (3): 422–432. doi:10.1016/0196-6774(84)90021-x. ISSN 0196-6774.

[:4-9] 9.0 ^9.1 Berstel, Jean (1994). Axel Thue's papers on repetitions in words a translation. Départements de mathématiques et d'informatique, Université du Québec à Montréal. ISBN 978-2892761405. OCLC 494791187.

[10] Leech, J. (1957). "A problem on strings of beads". Math. Gaz. 41: 277–278. doi:10.1017/S0025557200236115.

[:0-11] 11.0 ^11.1 Berstel, Jean (April 1984). "Some Recent Results on Squarefree Words". Annual Symposium on Theoretical Aspects of Computer Science. Lecture Notes in Computer Science 166: 14–25. doi:10.1007/3-540-12920-0_2. ISBN 978-3-540-12920-2. http://www.numdam.org/item/PDML_1985___2B_21_0/.

[:2-12] 12.0 ^12.1 Zolotov, Boris (2015). "Another Solution to the Thue Problem of Non-Repeating Words". arXiv:1505.00019 [math.CO].

[:3-13] 13.0 ^13.1 Khalyavin, Andrey (2007). "The minimal density of a letter in an infinite ternary square-free word is 883/3215". Journal of Integer Sequences 10 (2): 3. Bibcode: 2007JIntS..10...65K. http://emis.de/journals/JIS/VOL10/Khalyavin/khalyavin13.pdf.

[14] Ochem, Pascal (2007). "Letter frequency in infinite repetition-free words". Theoretical Computer Science 380 (3): 388–392. doi:10.1016/j.tcs.2007.03.027. ISSN 0304-3975.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

Anonymous

Search

Square-free word

Namespaces

More

Page actions

Contents

Finite squarefree words

Binary alphabet

Ternary alphabet

Alphabet with more than three letters

2-dimensional words

Generating finite squarefree words

Infinite squarefree words

Examples

First difference of the Thue–Morse sequence

Leech's morphism

Generating infinite squarefree words

Letter combinations in squarefree words

Avoid two-letter combinations

Avoid three-letter combinations

Density of a letter

Notes

References

Navigation

Navigation

Help

Translate

Wiki tools

Wiki tools

Anonymous

Search

Square-free word

Finite squarefree words

Binary alphabet

Ternary alphabet

Alphabet with more than three letters

2-dimensional words

Generating finite squarefree words

Infinite squarefree words

Examples

First difference of the Thue–Morse sequence

Leech's morphism

Generating infinite squarefree words

Letter combinations in squarefree words

Avoid two-letter combinations

Avoid three-letter combinations

Density of a letter

Notes

References

Navigation

Wiki tools

Page tools

Other projects

Categories