Technical papers describing various aspects of Tesseract

Copyright Notice

Material posted here is copyrighted and may not be sold or distributed without permission of the respective copyright holder.

Reading the Papers

The links below lead to PDF downloads.

IEEE Copyright Materials

The following materials appeared in IEEE publications, and each carries an IEEE copyright designation. Papers may not be sold or distributed further without written permission of the IEEE.

An Overview of the Tesseract OCR Engine

Hybrid Page Layout Analysis via Tab-Stop Detection

ACM Copyright Materials

Adapting the Tesseract Open Source OCR Engine for Multilingual OCR

©ACM, 2009. This is the authors’ version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the International Workshop on Multilingual OCR 2009, Barcelona, Spain July 25, 2009. https://dl.acm.org/citation.cfm?id=1577804

Other publications from Ray Smith

Ray Smith Publications
The extraction and recognition of text from multimedia document images by Smith, R.W. (Ph.D. thesis), 1987
Slides from Tutorial on Tesseract presented at DAS2014
Slides from Tutorial on Tesseract presented at DAS2016

Other

Video: PhotoTechEDU Day 11: Document Image Analysis with Leptonica
Training Tesseract for Ancient Greek OCR by Nick White
Shirorekha Chopping Integrated Tesseract OCR Engine for Enhanced Hindi Language Recognition by Nitin Mishra, C. Patvardhan, C. Vasantha Lakshmi, Sarika Singh
Report on the comparison of Tesseract and ABBYY FineReader OCR engines by Heliński, Kmieciak, and Parkoła
The hOCR Embedded OCR Workflow and Output Format (hOCR specification)
Text Detection on Nokia N900 Using Stroke Width Transform (with source code)
Generic Text Recognition using Long Short-Term Memory Networks - Ph.D. Thesis
Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning
Translation-Inspired OCR by Dmitriy Genzel, Ashok C. Popat, Nemanja Spasojevic, Michael Jahr, Andrew Senior, Eugene Ie, Frank …
Developing Multilingual OCR and Handwriting Recognition at Google by Ashok Popat, Research Scientist, Google Inc. IAPR Summer School, Jaipur, Jan 23, 2017.
General-Purpose OCR Paragraph Identification by Graph Convolutional Neural Networks by Renshen Wang, Yasuhisa Fujii, Ashok C. Popat, January 2021