Myhill–Nerode theorem

From HandWiki
Revision as of 21:01, 6 February 2024 by Dennis Ross (talk | contribs) (fix)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Short description: Necessary and sufficient condition for a formal language to be regular

In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1957 (Nerode Sauer).

Statement

Given a language [math]\displaystyle{ L }[/math], and a pair of strings [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math], define a distinguishing extension to be a string [math]\displaystyle{ z }[/math] such that exactly one of the two strings [math]\displaystyle{ xz }[/math] and [math]\displaystyle{ yz }[/math] belongs to [math]\displaystyle{ L }[/math]. Define a relation [math]\displaystyle{ \sim_L }[/math] on strings as [math]\displaystyle{ x\; \sim_L\ y }[/math] if there is no distinguishing extension for [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math]. It is easy to show that [math]\displaystyle{ \sim_L }[/math] is an equivalence relation on strings, and thus it divides the set of all strings into equivalence classes.

The Myhill–Nerode theorem states that a language [math]\displaystyle{ L }[/math] is regular if and only if [math]\displaystyle{ \sim_L }[/math] has a finite number of equivalence classes, and moreover, that this number is equal to the number of states in the minimal deterministic finite automaton (DFA) accepting [math]\displaystyle{ L }[/math]. Furthermore, every minimal DFA for the language is isomorphic to the canonical one (Hopcroft Ullman).

Myhill, Nerode (1957) — (1) [math]\displaystyle{ L }[/math] is regular if and only if [math]\displaystyle{ \sim_L }[/math] has a finite number of equivalence classes.

(2) This number is equal to the number of states in the minimal deterministic finite automaton (DFA) accepting [math]\displaystyle{ L }[/math].

(3) Any minimal DFA acceptor for the language is isomorphic to the following one:

Let each equivalence class [math]\displaystyle{ [x] }[/math] correspond to a state, and let state transitions be [math]\displaystyle{ a: [x] \to [xa] }[/math] for each [math]\displaystyle{ a\in \Sigma }[/math]. Let the starting state be [math]\displaystyle{ [\epsilon] }[/math], and the accepting states be [math]\displaystyle{ [x] }[/math] where [math]\displaystyle{ x\in L }[/math].

Generally, for any language, the constructed automaton is a state automaton acceptor. However, it does not necessarily have finitely many states. The Myhill–Nerode theorem shows that finiteness is necessary and sufficient for language regularity.

Some authors refer to the [math]\displaystyle{ \sim_L }[/math] relation as Nerode congruence,[1][2] in honor of Anil Nerode.

Use and consequences

The Myhill–Nerode theorem may be used to show that a language [math]\displaystyle{ L }[/math] is regular by proving that the number of equivalence classes of [math]\displaystyle{ \sim_L }[/math] is finite. This may be done by an exhaustive case analysis in which, beginning from the empty string, distinguishing extensions are used to find additional equivalence classes until no more can be found. For example, the language consisting of binary representations of numbers that can be divided by 3 is regular. Given the empty string, [math]\displaystyle{ 00 }[/math] (or [math]\displaystyle{ 11 }[/math]), [math]\displaystyle{ 01 }[/math], and [math]\displaystyle{ 10 }[/math] are distinguishing extensions resulting in the three classes (corresponding to numbers that give remainders 0, 1 and 2 when divided by 3), but after this step there is no distinguishing extension anymore. The minimal automaton accepting our language would have three states corresponding to these three equivalence classes.

Another immediate corollary of the theorem is that if for a language [math]\displaystyle{ L }[/math] the relation [math]\displaystyle{ \sim_L }[/math] has infinitely many equivalence classes, it is not regular. It is this corollary that is frequently used to prove that a language is not regular.

Generalizations

The Myhill–Nerode theorem can be generalized to tree automata.[3]

See also

References

  1. "Syntactic Complexity of Regular Ideals", Theory of Computing Systems 62 (5): 1175–1202, 2018, doi:10.1007/s00224-017-9803-8 
  2. Crochemore, Maxime et al. (2009), "From Nerode's congruence to suffix automata with mismatches", Theoretical Computer Science 410 (37): 3471–3480, doi:10.1016/j.tcs.2009.03.011 
  3. Hubert Comon; Max Dauchet; Rémi Gilleron; Florent Jacquemard; Denis Lugiez; Christoph Löding; Sophie Tison; Marc Tommasi (Oct 2021). Tree Automata Techniques and Applications (TATA). https://hal.inria.fr/hal-03367725.  Here: Sect. 1.5, p.35-36.

Further reading