Straight-line grammar

From HandWiki

A straight-line grammar (sometimes abbreviated as SLG) is a formal grammar that generates exactly one string.[1] Consequently, it does not branch (every non-terminal has only one associated production rule) nor loop (if non-terminal A appears in a derivation of B, then B does not appear in a derivation of A).[1]

Areas of usefulness

Straight-line grammars are widely used in the development of algorithms that execute directly on compressed structures (without prior decompression).[2]:212

SLGs are of interest in fields like Kolmogorov complexity, Lossless data compression, Structure discovery and Compressed data structures.[clarification needed]

The problem of finding a context-free grammar (equivalently: an SLG) of minimal size that generates a given string is called the smallest grammar problem.[citation needed]

Straight-line grammars (more precisely: straight-line context-free string grammars) can be generalized to Straight-line context-free tree grammars. The latter can be used conveniently to compress trees.[2]:212

Formal Definition

A context-free grammar G is an SLG if:

1. for every non-terminal N, there is at most one production rule that has N as its left-hand side, and

2. the directed graph G=<V,E>, defined by V being the set of non-terminals and (A,B) ∈ E whenever B appears at the right-hand side of a production rule for A, is acyclic.

A mathematical definition of the more general formalism of straight-line context-free tree grammars can be found in Lohrey et al.[2]:215

An SLG in Chomsky normal form is equivalent to a straight-line program.[citation needed]

A list of algorithms using SLGs

  • The Sequitur algorithm constructs a straight-line grammar for a given string.
  • The Lempel-Ziv-Welch algorithm creates a context-free grammar in such a deterministic way that it is necessary to store only the start rule of the generated grammar.
  • Byte pair encoding

See also

  • Grammar-based code – Lossless data compression algorithm
  • Non-recursive grammar - a grammar that does not loop, but may branch; generating a finite rather than a singleton language

References

  1. 1.0 1.1 Florian Benz and Timo Kötzing, “An effective heuristic for the smallest grammar problem,” Proceedings of the fifteenth annual conference on Genetic and evolutionary computation conference - GECCO ’13, 2013. ISBN:978-1-4503-1963-8 doi:10.1145/2463372.2463441, p. 488
  2. 2.0 2.1 2.2 Markus Lohrey; Sebastian Maneth; Manfred Schmidt-Schauß (2009). "Parameter Reduction in Grammar-Compressed Trees". Proc. FOSSACS. LNCS. 5504. Springer. pp. 212–226. http://www.eti.uni-siegen.de/ti/veroeffentlichungen/09-fossacs.pdf.