Hacker News new | past | comments | ask | show | jobs | submit login

I would be very interested if someone is aware of any small/tiny models to perform OCR, so the app can translate pictures as well





MiniCPM-V 2.6 isn't that small (8b) but it can do this.

Here is a demo.

* https://i.imgur.com/pAuTeAf.jpeg

Using this script:

* https://github.com/jabberjabberjabber/LLMOCR/




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: