Pika parser
In computer science, a pika parser[1] is a parser that applies PEG rules right to left, bottom-up, using dynamic programming, which is the inverse of the normal recursive descent or packrat parsing order[2] of top-down, left to right.
Properties
By parsing in reverse, pika parsing solves the left recursion problem, allowing left-recursive rules to be used directly in the grammar without them having to be rewritten into non-left-recursive form,[3] and also conveys optimal error recovery capabilities upon the parser, which historically proved difficult to achieve for recursive descent parsers.[4] The ability to recover from syntax errors is critical to the function of IDEs and compilers.
Performance
Like a packrat parser, a pika parser takes time linear in the length of the input and the depth of the parse tree. However, for very large grammars, Pika parsing incurs a sizeable constant multiplier per input character, which may mean that other parsers with super-linear parse time are faster for short to moderate inputs. For smaller grammars, e.g. an expression grammar, Pika parsing may be significantly faster than other parsers (as shown in the original paper).
Name origin
Pika parsing is named for the pika, which, like a packrat, stores food for the winter in "haystacks" or caches. (The earlier packrat parsing method got its name from the use of a memo table to store function call parameters and their results for later reuse, to avoid duplicated work.)
References
- ↑ "Pika parsing: reformulating packrat parsing as a dynamic programming algorithm solves the left recursion and error recovery problems". 2020-05-31. https://arxiv.org/pdf/2005.06444v3.pdf.
- ↑ "Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking". Massachusetts Institute of Technology. 2002-09-03. https://pdos.csail.mit.edu/~baford/packrat/thesis/. [1]
- ↑ "Left Recursion in Parsing Expression Grammars". Science of Computer Programming 96: 177–190. 2014-02-13. https://arxiv.org/pdf/1207.0443.pdf. Retrieved 2020-06-18.
- ↑ "Automatic Syntax Error Reporting and Recovery in Parsing Expression Grammars". Science of Computer Programming. v2 187: 102373. 2020-10-01. https://arxiv.org/pdf/1905.02145.pdf. Retrieved 2020-06-18.