Tesseract cannot process *.bmp files

Tesseract cannot process *.bmp files. It fails with this error:

Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
Error in pixReadMemBmp: size incommensurate with image data
Error in pixReadStream: bmp: no pix returned
Error in pixRead: pix not read
Error during processing.

tesseract -v

tesseract 4.00.00alpha
 leptonica-1.74.4
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0

 Found AVX2
 Found AVX
 Found SSE

Stable version

tesseract 4.0.0
 leptonica-1.75.3
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0

 Found AVX2
 Found AVX
 Found SSE


Error in pixRemoveColormap: pixs must be {1,2,4,8} bpp
Error in pixGetDepth: pix not defined
Error in pixGetDepth: pix not defined
Error in pixGetWpl: pix not defined
Error in pixGetYRes: pix not defined
Error in pixClone: pixs not defined
Please call SetImage before attempting recognition.
Error during processing.

Ubuntu

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial

Answer 1

So if you want to run Tesseract on these images, convert the BMPs to another format first, preferably TIFF.

for a in *.bmp; do
  convert "$a" "${a%.*}.tiff"
done

You need ImageMagick (which provides convert) and bash to run this code.
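
As a possible extension of the loop above, the conversion and the OCR step can be combined in one script. This is only a sketch: it assumes that ImageMagick's convert and tesseract are both on the PATH, and the helper tiff_name is a hypothetical name introduced here for illustration. On newer ImageMagick 7 installations the command may be magick rather than convert.

```bash
#!/usr/bin/env bash
# Sketch: convert every BMP in the current directory to TIFF, then OCR it.
# Assumption: ImageMagick's `convert` and `tesseract` are installed and on PATH.

# Hypothetical helper: derive the TIFF name from a BMP name (scan.bmp -> scan.tiff).
tiff_name() {
  printf '%s.tiff\n' "${1%.*}"
}

for bmp in *.bmp; do
  # If no BMP files exist, the pattern stays literal; skip it.
  [ -e "$bmp" ] || continue

  tiff=$(tiff_name "$bmp")

  # Skip the OCR step for this file if the conversion fails.
  if ! convert "$bmp" "$tiff"; then
    echo "conversion failed: $bmp" >&2
    continue
  fi

  # tesseract takes the image and an output base name; it writes <base>.txt.
  tesseract "$tiff" "${bmp%.*}"
done
```

Quoting the variables keeps file names with spaces intact, and checking the exit status of convert avoids handing Tesseract a TIFF that was never written.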
