
PaddleOCR seemed to be a good library for locating and translating text. I've been puzzling over how to translate something like a simple letter form into an LLM-translatable format.

I think the serious problem is that most of these LLMs are already built on top of garbage, so you're already the GI (garbage in) and just trying to match that as best you can.




I built a library around this problem [1]. I recently experimented with PaddleOCR but found the results very underwhelming (no spacing between text); it seems to be heavily optimized for Chinese. There is a three-year-old GitHub issue about it, and the problem still seems to occur out of the box. I'd be curious to hear other people's experience with it.

[1] https://github.com/Filimoa/open-parse/
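
For anyone trying to get OCR output into something an LLM can read, one possible workaround for missing spaces is to rebuild lines from bounding boxes. This is only a rough sketch: it assumes the OCR step hands back separate fragments with pixel boxes, and the `(text, x0, y0, x1, y1)` tuple format, the `fragments_to_text` name, and the `gap_ratio` parameter are all hypothetical, not PaddleOCR's or open-parse's actual API. It won't help if the spaces are already lost inside a single recognized fragment.

```python
from typing import List, Tuple

# Hypothetical fragment format (an assumption, not a real library's output):
# (text, x0, y0, x1, y1) with pixel coordinates, origin at top-left.
Fragment = Tuple[str, float, float, float, float]


def fragments_to_text(fragments: List[Fragment], gap_ratio: float = 0.5) -> str:
    """Rebuild plain-text lines from OCR fragments using their boxes."""
    lines: List[List[Fragment]] = []

    # Sort by vertical centre so fragments on the same row end up adjacent.
    for frag in sorted(fragments, key=lambda f: (f[2] + f[4]) / 2):
        centre = (frag[2] + frag[4]) / 2
        if lines:
            last = lines[-1][-1]
            # Same row if this centre falls inside the previous box's height.
            if last[2] <= centre <= last[4]:
                lines[-1].append(frag)
                continue
        lines.append([frag])

    out = []
    for line in lines:
        line.sort(key=lambda f: f[1])  # left to right
        text = line[0][0]
        for prev, cur in zip(line, line[1:]):
            # Estimate character width from the previous fragment.
            char_w = (prev[3] - prev[1]) / max(len(prev[0]), 1)
            gap = cur[1] - prev[3]
            # Insert a space only when the gap is wide relative to a character.
            text += (" " if gap > gap_ratio * char_w else "") + cur[0]
        out.append(text)
    return "\n".join(out)
```

The `gap_ratio` threshold is a guess and would likely need tuning per font and scan resolution.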



