Release Notes

This page keeps the most up-to-date release notes.

IN DEVELOPMENT

API/ABI changes review for Tesseract

API/ABI changes graph

V5.3.4

Jan 18 2024

https://github.com/tesseract-ocr/tesseract/releases/tag/5.3.4

V5.3.3

Oct 05 2023

https://github.com/tesseract-ocr/tesseract/releases/tag/5.3.3

V5.3.2

Jul 11 2023

https://github.com/tesseract-ocr/tesseract/releases/tag/5.3.2

V5.3.1

Apr 01 2023

Improve the DebugDump output by slightly adjusting the format. By @GerHobbelt in PR #4022.

Bug fixes

CMake Build system

Compiler support

We dropped support for GCC and libstdc++ 8.x.

V5.3.0

Dec 22 2022

LSTM training: Extend the function BoxFileName to handle another image name extension, .raw.png. By @bertsky in PR #3962.

Bug fixes

Build systems

V5.2.0

Jul 06 2022

V5.1.0

Mar 01 2022

V5.0.1

Jan 07 2022

CMake build:

V5.0.0

Nov 30 2021

V4.1.3

Nov 15 2021

Fix broken autotools build.

V4.1.2

Nov 14 2021

Changes in the Autotools build:

V4.1.1

Dec 26 2019

V4.1.0

Jul 07 2019

V4.0.0

Oct 29 2018

V3.05.02

Jun 19 2018

This release fixed a few bugs, backported from 4.0.0.

V3.05.01

Jun 01 2017

V3.05.00

Feb 16 2017

V3.04.01

Feb 16 2016

V3.04.00

Jul 11 2015

V3.03(rc1)

Feb 04 2014

V3.02.02

Oct 23 2012

V3.01

Oct 21 2011

V3.00

Sep 30 2010

V2.04

Jun 30 2009

V2.03

Apr 22 2008

2.02 was unrunnable, due to a last-minute “simple” change. 2.03 fixes the problem. It also adds an include check for leptonica to make it more usable.

V2.02

Apr 21 2008

V2.01

Aug 30 2007

(See also release notes for 2.00 below for usage information)

No major functionality change. Just a bunch of bug fixes.

No new data files for the original 6 languages. Use the files from v2.00. There are new data files for German Fraktur (deu-f) and Brazilian Portuguese (por).

STOP PRESS: There is a minor bug in unicharset_extractor. It only affects training, so the main tarball is fine unless you need to run training. If you do, overwrite your unicharset_extractor.cpp and unicharset_extractor.exe with the ones in tesseract-2.01.patch1.tar.gz.

V2.00

Jul 18 2007

(See also release notes for 1.04 below for additional usage information)

First release of the International version. This version recognizes the following languages:

The language codes follow ISO 639-2. The default language is English. To recognize another language:

tesseract inputimage outputbase -l langcode
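
As a minimal sketch, the command line above can be assembled from a script. The helper name `build_tesseract_command`, the file names, and the language code `deu` are illustrative assumptions; only the `inputimage outputbase -l langcode` form shown above is taken from these notes.

```python
import shlex


def build_tesseract_command(input_image, output_base, lang=None):
    """Build the argument list for the invocation shown above.

    Hypothetical helper: it only assembles arguments and does not run anything.
    """
    cmd = ["tesseract", input_image, output_base]
    if lang:
        # Without -l, Tesseract falls back to its default language (English).
        cmd += ["-l", lang]
    return cmd


if __name__ == "__main__":
    # Illustrative values; substitute your own image, output base and code.
    print(shlex.join(build_tesseract_command("page.tif", "page", "deu")))
```

The resulting list can be passed to `subprocess.run` on a machine where the `tesseract` binary is installed.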

To train on a new language, see TrainingTesseract2. More languages will be appearing over time.

List of changes in this release:

Warning: Tesseract 2.00 has undergone more compatibility testing than any previous version. There have even been fixes to make the accuracy more consistent across platforms. Having said that, there have been many changes to the code, and portability may have been broken, so 64 bit and Mac platforms may not work or even build as well as before.

V1.04

May 15 2007

Tesseract development is now done with Subversion and hosted at code.google.com. (Previously we used CVS for version control and sourceforge.net for hosting.)

Windows users only

Added a dll interface for windows. Thanks to Glen at Jetsoft for contributing this. To use the dll, include tessdll.h, import tessdll.lib and put tessdll.dll somewhere where the system can find it. There is also a small dlltest program to test the dll. Run with:

dlltest phototest.tif phototest.txt

It will output the text from phototest.tif with bounding box information.

New for Windows

The distribution now includes tesseract.exe and tessdll.dll which might work out of the box! There are no guarantees as you need VC++6 versions of MFC and CRT (at least) for it to work. (Batteries not included, and certainly no installshield.)

Important note for anyone building with make (i.e. anyone except devstudio users)

This release includes new standardization for the data directory. To enable Tesseract to find its data files, you must either:

./configure
make
make install

to move the data files to the standard place, or:

export TESSDATA_PREFIX="directory in which your tessdata resides/"

(or equivalent) in your .profile, or use setenv, to set the environment variable. Note that the directory must end in a /.

HAVING tesseract and tessdata IN THE SAME DIRECTORY DOES NOT WORK ANY MORE.
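
The trailing-slash requirement is easy to get wrong when the variable is set from a script. The following is a minimal sketch, assuming a launcher written in Python; the helper names `tessdata_prefix` and `environment_with_prefix` and the directory `/opt/ocr` are hypothetical and not part of Tesseract.

```python
import os


def tessdata_prefix(directory):
    """Return `directory` with exactly one trailing slash, as required above."""
    return directory.rstrip("/") + "/"


def environment_with_prefix(directory, base_env=None):
    """Copy an environment mapping and set TESSDATA_PREFIX in the copy.

    `directory` is the directory in which tessdata resides, not tessdata itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TESSDATA_PREFIX"] = tessdata_prefix(directory)
    return env


if __name__ == "__main__":
    # /opt/ocr is an illustrative path; use the parent of your tessdata directory.
    print(environment_with_prefix("/opt/ocr", {})["TESSDATA_PREFIX"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching Tesseract, which avoids editing .profile on machines where `make install` is not an option.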

All users

Fixed a bunch of name collisions, mostly with STL. Made some preliminary changes for Unicode compatibility. This release includes a new data file (unicharset) and renames the other data files to eng. to support different languages. There are also several other minor bug fixes and portability improvements for 64 bit, the latest Visual Studio compiler, etc.

Thanks to all who have contributed these fixes.

NOTE: This is likely to be the last English-only release! Apologies in advance to non-windows users for bloating the distribution with windows executables. This will probably get fixed in the next release with the multi-language capability, since that will also bloat the distribution.

V1.03

Feb 03 2007

V1.02

Oct 04 2006

V1.01

Sep 07 2006

V1.00

Jun 17 2006

First open source version of Tesseract!

Hosted at sourceforge.net. CVS is used for version control.