Answer 1
We should call tesseract with the option -psm <N> to set the page segmentation mode (Tesseract 4 and later spell this option --psm):
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR.
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
The options of interest are 10 and 6 when our bitmap source contains only a single character.
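As a rough illustration of how the mode list maps onto a command line, here is a minimal Python sketch. The names PSM_MODES and build_tesseract_cmd are hypothetical helpers made up for this example, not part of tesseract. The sketch assumes the usual CLI layout of image name, then output base, then options, and passes stdout as the output base so the text is printed rather than written to a file. It only builds the argument list and does not run anything.

```python
from typing import List

# Page segmentation modes as listed above (Tesseract 3.x numbering, 0 to 10).
PSM_MODES = {
    0: "Orientation and script detection (OSD) only",
    1: "Automatic page segmentation with OSD",
    2: "Automatic page segmentation, but no OSD, or OCR",
    3: "Fully automatic page segmentation, but no OSD (default)",
    4: "Assume a single column of text of variable sizes",
    5: "Assume a single uniform block of vertically aligned text",
    6: "Assume a single uniform block of text",
    7: "Treat the image as a single text line",
    8: "Treat the image as a single word",
    9: "Treat the image as a single word in a circle",
    10: "Treat the image as a single character",
}


def build_tesseract_cmd(
    image: str,
    psm: int,
    output_base: str = "stdout",
    legacy_flag: bool = True,
) -> List[str]:
    """Build (but do not run) a tesseract command line for one mode.

    legacy_flag=True uses the 3.x spelling "-psm" from this answer;
    legacy_flag=False uses "--psm", the spelling in Tesseract 4 and later.
    """
    if psm not in PSM_MODES:
        raise ValueError(f"unknown page segmentation mode: {psm}")
    flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image, output_base, flag, str(psm)]
```

For example, build_tesseract_cmd("LO1v5.png", 6) yields the argument list for the block-of-text mode, and passing psm=10 selects single-character mode. The resulting list can be handed to subprocess.run if tesseract is installed.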
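Running tesseract on the grayscale image source as follows:
tesseract LO1v5.png -psm 6
gives the correct result, 8. The green image source, however, is too much of a challenge for tesseract, which specializes in whole text rather than digits.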
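With improved input quality, calling tesseract in single-character recognition mode gives better results:
tesseract sourceimage -psm 10
This gives us a correct guess of 8, but only a nearly correct guess of B for 0.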
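To compare the two modes discussed here side by side, something like the following Python sketch could be used. It is only an illustration under stated assumptions: recognize_with_modes and run_command are hypothetical names, tesseract is assumed to be on the PATH, stdout is passed as the output base so the recognized text can be captured, and the psm_flag default of -psm matches the 3.x spelling used in this answer (newer versions would need --psm).

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def run_command(cmd: List[str]) -> str:
    """Run a command and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def recognize_with_modes(
    image: str,
    modes: Iterable[int] = (6, 10),
    psm_flag: str = "-psm",
    runner: Callable[[List[str]], str] = run_command,
) -> Dict[int, str]:
    """Run tesseract once per page segmentation mode and collect the text.

    The runner parameter exists so the command execution can be replaced,
    for instance in a test on a machine without tesseract installed.
    """
    results: Dict[int, str] = {}
    for mode in modes:
        cmd = ["tesseract", image, "stdout", psm_flag, str(mode)]
        # Strip surrounding whitespace and the trailing newlines tesseract emits.
        results[mode] = runner(cmd).strip()
    return results
```

Calling recognize_with_modes("LO1v5.png") would return a dictionary keyed by mode, which makes it easy to see whether mode 6 or mode 10 reads a given single-character image better. What it actually returns depends on the image and the installed tesseract version.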