

Just to add quickly: Evernote isn’t doing OCR in the way that we normally think of it. Evernote is optically recognizing text, but it does something different with the result than conventional OCR does. There’s a brief explainer about the implications here: /topic/5 … ent=301397

Conventional OCR usually comes up with a single text character for each character in the image, and maps that text to the image. Sometimes the text is wrong (a poor quality image, different languages in a single document, a bad OCR engine, etc.), but hopefully most of the time it’s right. This is one reason why handwriting is typically not recognized by OCR engines: while typed characters can be relatively unambiguous, a handwritten character is ambiguous even at the best of times. Is it a cursive L or a cursive capital I?

In contrast, what Evernote does is take the image of text (or handwriting) and develop a tree of candidate characters and words. There is no “canonical” text layer behind the image of text; there’s a large number of possible text characters and words.
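
To make the difference concrete, here is a rough Python sketch of "one answer per character" versus "keep all the candidates". To be clear, this is not Evernote's actual implementation: the slot structure, the confidence numbers, and the function names are all made up for illustration, and a flat list of candidates per character is a simplification of the tree described in the post.

```python
from itertools import product

# Hypothetical recognizer output for one handwritten word: each slot holds
# the candidate characters for one glyph, with an invented confidence.
SLOTS = [
    [("I", 0.55), ("l", 0.45)],  # cursive L or cursive capital I?
    [("a", 0.90), ("o", 0.10)],
    [("k", 1.00)],
    [("e", 0.80), ("c", 0.20)],
]

def single_best(slots):
    """Conventional-OCR style: keep only the top candidate in each slot."""
    return "".join(max(slot, key=lambda c: c[1])[0] for slot in slots)

def all_readings(slots):
    """Candidate style: every possible reading, scored by multiplying
    the per-character confidences together."""
    readings = {}
    for combo in product(*slots):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, confidence in combo:
            score *= confidence
        readings[word] = score
    return readings

def matches(query, slots):
    """A search hit is any candidate reading that equals the query."""
    return query in all_readings(slots)

if __name__ == "__main__":
    print(single_best(SLOTS))      # the one word a single text layer would store
    print(matches("lake", SLOTS))  # whether a search for "lake" would find the image
```

With these made-up numbers, the single-answer approach commits to the wrong first letter and stores only that, while the candidate approach still lets a search for "lake" find the image, because that reading was never thrown away.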
