An optical character recognition (OCR) engine that converts images of text into machine-encoded text.
"To digitize the text from a picture, we used the Tesseract OCR engine."
A code that identifies a specific language. Tesseract uses three-letter codes that mostly follow ISO 639-2 (for example, 'eng' for English), while many other systems use two-letter ISO 639-1 codes (for example, 'en').
"To process English text with Tesseract, we have to set the language code to 'eng'."
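As a rough illustration of how a language code reaches the engine, the sketch below builds a Tesseract command line with the -l option. It assumes the tesseract executable is installed and on PATH. The lookup table and the helper names (to_tesseract_lang, build_command, run_ocr) are hypothetical and exist only for this example; the table covers just a few languages.

```python
import shutil
import subprocess

# Hypothetical lookup table (illustration only): a few two-letter ISO 639-1
# codes mapped to the three-letter codes used by Tesseract language data.
ISO_639_1_TO_TESSERACT = {
    "en": "eng",
    "de": "deu",
    "fr": "fra",
    "es": "spa",
}


def to_tesseract_lang(codes):
    """Join codes into the 'eng+deu' form that the -l option accepts.

    Codes not found in the table are passed through unchanged, so a caller
    can also supply a Tesseract code such as 'eng' directly.
    """
    return "+".join(ISO_639_1_TO_TESSERACT.get(code, code) for code in codes)


def build_command(image_path, codes):
    """Return the argument list for a tesseract call that prints to stdout."""
    # Using 'stdout' as the output base makes tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", to_tesseract_lang(codes)]


def run_ocr(image_path, codes=("en",)):
    """Run tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_command(image_path, codes),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling build_command("page.png", ["en"]) yields the arguments for an English-only run, and passing several codes requests multiple languages in one pass, provided the matching language data files are installed.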