Skip to the content.

Benchmarks

This page is dedicated to simple benchmarking of various tesseract version and options. As input image for testing is used image from issue 236.

Results

Build tessdata_best tessdata_fast tessdata
305 - - 2.4713
413noavx 37.6052 5.1589 10.1519
413avx 12.7300 2.9538 4.0860
501 6.1981 2.1241 2.9107
501ap 6.1369 2.1254 2.9221
501openmp 3.4590 1.9612 2.3554

Information about testing environment

Tested with python code

import timeit
import time
import os
import pytesseract

start_time = time.time()
tess_exe = r"msvc.v5.openmp\tesseract.exe"
test_image = r"i263_speed.jpg"
os.environ['TESSDATA_PREFIX'] = r"tessdata_best\tessdata"

code_to_test = """
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"{}"
pytesseract.pytesseract.image_to_string(r"{}", lang = 'eng')
"""

runs = 15
elapsed_time = timeit.timeit(code_to_test.format((tess_exe, test_image), number=runs)/runs
print("\nDuration:", elapsed_time)

Tesseract build info

information provided by tesseract -v

3.05

305

It uses the legacy engine.

tesseract 3.05.02
 leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0

4.1

413noavx

Build without AVX2/AVX/SSE4 support

tesseract 4.1.3
 leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
 Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9

413avx

Build with AVX2/AVX/SSE4 support

tesseract 4.1.3-1-ge9986
 leptonica-1.83.0 (Jan 26 2022, 19:15:03) [MSC v.1929 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
 Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9

5.0

501

AVX2 supported

tesseract 5.0.1
 leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
 Found libcurl/7.75.0 zlib/1.2.11 libssh2/1.10.1_DEV

501ap

build with: cmake -E env CXXFLAGS="/Qpar /fp:fast" cmake ..

tesseract 5.0.1
 leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
 Found libcurl/7.75.0 zlib/1.2.11 libssh2/1.10.1_DEV

501openmp

The OpenMP build is known to waste a lot of CPU time. As several users report problems, it is disabled by default in version 5.0.1 and later releases. For other versions (>= 4.x) it is suggested to use the environment variable OMP_THREAD_LIMIT=1. Input from OpenMP experts would be appreciated.

tesseract 5.0.1
 leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 2019
 Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
 Found libcurl/7.75.0 zlib/1.2.11 libssh2/1.10.1_DEV