It would be interesting to see how Tesseract holds up against this.

peterburkimsher · on Dec 20, 2017

Pingtype is my program for learning Chinese. I tried to use Tesseract to recognise some Chinese text, taken directly from a PDF that wouldn't let me copy and paste for some reason. The results were awful.

I tried again with English text. I wanted a word list from a book that helps people learn English, so I took photos of the index. The format is word....page #, in two columns.

The results were just as bad.
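
As a rough illustration of the index format described above, here is a minimal Python sketch that splits text lines of the form word....page into (word, page) pairs. The names `parse_index_line` and `parse_index`, and the assumption that the leader is a run of two or more dots, are hypothetical; they are not taken from the book or from Pingtype.

```python
import re

# Assumed layout: an entry, a leader of 2+ dots (optionally padded
# with spaces), then a page number at the end of the line.
INDEX_LINE = re.compile(r"^\s*(?P<word>.+?)\s*\.{2,}\s*(?P<page>\d+)\s*$")


def parse_index_line(line):
    """Return (word, page) for a 'word....page' line, or None if it does not match."""
    match = INDEX_LINE.match(line)
    if match is None:
        return None
    return match.group("word"), int(match.group("page"))


def parse_index(text):
    """Parse every matching line; lines that do not match are skipped."""
    entries = []
    for line in text.splitlines():
        parsed = parse_index_line(line)
        if parsed is not None:
            entries.append(parsed)
    return entries
```

This assumes the two columns have already been separated, so each line holds a single entry. A line that still contains entries from both columns would be parsed as one long entry, and OCR errors in the dot leader would cause the line to be skipped.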

I've given up on OCR, and decided I have to transcribe everything by hand. I only do it in my free time, and it's been taking months.

Is there any tool that can take a photo of a book where the pages curl towards the middle, and "flatten" it so that OCR will work better?

mohi13 · on Dec 19, 2017

What's that? Thor? :P

correlation · on Dec 19, 2017

Hehe - well, I was referring to the major open-source OCR library. It is supposed to have LSTM-based recognition in the next major revision: https://github.com/tesseract-ocr/tesseract/blob/master/READM...

jremmons · on Dec 19, 2017

I have used Tesseract, and in my experience, unless you train it for the particular type of text you want to recognize (font, background color, etc.), it will do quite poorly (including the recent LSTM-based versions). It would be great to see how it stacks up against these APIs, though.

tensor · on Dec 19, 2017

The supplied models are trained on document-like images, so I wouldn't expect them to do particularly well on things like street signs. My experience with the new LSTM-based versions is that they are very much competitive with closed-source solutions for document-like OCR.