Sayre's paradox

From HandWiki

Sayre's paradox is a dilemma encountered in the design of automated handwriting recognition systems. A standard statement of the paradox is that a cursively written word cannot be recognized without being segmented and cannot be segmented without being recognized.[1][2] The paradox was first articulated in a 1973 publication by Kenneth M. Sayre, after whom it was named.[3]

Nature of the problem

It is relatively easy to design automated systems capable of recognizing words inscribed in a printed format. Such words are segmented into letters by the very act of writing them on the page. Given templates matching typical letter shapes in a given language, individual letters can be identified with a high degree of probability. In cases of ambiguity, probable letter sequences can be compared with a selection of properly spelled words in that language (called a lexicon).[4] If necessary, syntactic features of the language can be applied to render a generally accurate identification of the words in question.[5] Printed-character recognition systems of this sort are commonly used in processing standardized government forms, in sorting mail by zip code, and so forth.

In cursive writing, however, letters comprising a given word typically flow sequentially without gaps between them. Unlike a sequence of printed letters, cursively connected letters are not segmented in advance. Here is where Sayre's Paradox comes into play. Unless the word is already segmented into letters, template-matching techniques like those described above cannot be applied. That is, segmentation is a prerequisite for word recognition. But there are no reliable techniques for segmenting a word into letters unless the word itself has been identified. Word recognition requires letter segmentation, and letter segmentation requires word recognition. There is no way a cursive writing recognition system employing standard template-matching techniques can do both simultaneously.

Advantages to be gained by use of automated cursive writing recognition systems include routing mail with handwritten addresses, reading handwritten bank checks, and automated digitalization of hand-written documents.[1] These are practical incentives for finding ways of circumventing Sayre's Paradox.

Avoiding the paradox

One way of ameliorating the adverse effects of the paradox is to normalize the word inscriptions to be recognized. Normalization amounts to eliminating idiosyncrasies in the penmanship of the writer, such as unusual slope of the letters and unusual slant of the cursive line.[4] This procedure can increase the probability of a correct match with a letter template, resulting in an incremental improvement in the success rate of the system. Since improvement of this sort still depends on accurate segmentation, however, it remains subject to the limitations of Sayre's Paradox. Researchers have come to realize that the only way to circumvent the paradox is by use of procedures that do not rely on accurate segmentation.[1]

Directions of current research

Segmentation is accurate to the extent that it matches distinctions among letters in the actual inscriptions presented to the system for recognition (the input data). This is sometimes referred to as “explicit segmentation”.[4] “Implicit segmentation,” by contrast, is division of the cursive line into more parts than the number of actual letters in the cursive line itself. Processing these “implicit parts” to achieve eventual word identification requires specific statistical procedures involving Hidden Markov Models (HMM).

A Markov model is a statistical representation of a random process, which is to say a process in which future states are independent of states occurring before the present. In such a process, a given state is dependent only on the conditional probability of its following the state immediately before it. An example is a series of outcomes from successive casts of a die. An HMM is a Markov model, individual states of which are not fully known. Conditional probabilities between states are still determinate, but the identities of individual states are not fully disclosed.

Recognition proceeds by matching HMMs of words to be recognized with previously prepared HMMs of words in the lexicon. The best match in a given case is taken to indicate the identity of the handwritten word in question. As with systems based on explicit segmentation, automated recognition systems based on implicit segmentation are judged more or less successful according to the percentage of correct identifications they accomplish.

Instead of explicit segmentation techniques, most automated handwriting recognition systems today employ implicit segmentation in conjunction with HMM-based matching procedures.[1] The constraints epitomized by Sayre's Paradox are largely responsible for this shift in approach.

References

  1. 1.0 1.1 1.2 1.3 Vinciarelli, Alessandro (April 2003). Offline Cursive Handwriting: From Word To Text Recognition (PhD). IDIAP.
  2. Fischer, Andreas; Frinken, Volkmar; Bunke, Horst (2013). "Chapter 17 - Hidden Markov Models for Off-Line Cursive Handwriting Recognition". in Rao, C.R.; Govindaraju, Venu (in english). Handbook of Statistics. Elsevier. pp. 421–442. doi:10.1016/B978-0-444-53859-8.00017-5. ISBN 9780444538598. https://www.sciencedirect.com/science/article/pii/B9780444538598000175. 
  3. Sayre, Kenneth M. (1973). "Machine recognition of handwritten words: A project report" (in en). Pattern Recognition (Pergamon Press) 5 (3): 213–228. doi:10.1016/0031-3203(73)90044-7. ISSN 0031-3203. Bibcode1973PatRe...5..213S. https://www.sciencedirect.com/science/article/abs/pii/0031320373900447. 
  4. 4.0 4.1 4.2 Vinciarelli, Alessandro (July 2002). "A survey on off-line Cursive Word Recognition" (in english). Pattern Recognition 35 (7): 1433–1446. doi:10.1016/S0031-3203(01)00129-7. ISSN 0031-3203. Bibcode2002PatRe..35.1433V. https://www.sciencedirect.com/science/article/pii/S0031320301001297. 
  5. Maroneze, André O.; Coüasnon, Bertrand; Lemaitre, Aurélie (2011-01-24). "Introduction of statistical information in a syntactic analyzer for document image recognition". in Agam, Gady; Viard-Gaudin, Christian. Document Recognition and Retrieval XVIII. 7874. SPIE. pp. 28–38. doi:10.1117/12.873393. https://www.spiedigitallibrary.org/conference-proceedings-of-spie/7874/787404/Introduction-of-statistical-information-in-a-syntactic-analyzer-for-document/10.1117/12.873393.short. 

External links