Hacker News

This is interesting. How much of a difference is there (in cost and quality) between this approach and taking a screenshot of the page and sending it to a multimodal LLM?



Good question, I actually haven't tried it with the image capture approach. I'll give that a shot and see how it performs. I'm planning to try many different AI extractors, and see which performs best.

So far, I've done some unscientific testing to compare text vs. HTML. Text is a lot more effective on a per-token basis, and therefore cheaper. However, some data is only available in the HTML.
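
To make the text vs. HTML trade-off concrete, here is a minimal sketch (not the extractor discussed in this thread). It strips a snippet of HTML down to visible text using Python's standard `html.parser` and compares rough sizes. The sample markup, the helper names `html_to_text` and `rough_token_count`, and the "about 4 characters per token" rule of thumb are all assumptions for illustration; a real tokenizer will give different counts.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Hypothetical helper: return the visible text of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def rough_token_count(s: str) -> int:
    # Assumption: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(s) // 4)


# Made-up sample markup, purely for illustration.
SAMPLE_HTML = (
    '<div class="product"><a href="/item/42">Blue Widget</a>'
    '<span class="price">$19.99</span></div>'
)

text = html_to_text(SAMPLE_HTML)
print(text)  # Blue Widget $19.99
print(rough_token_count(SAMPLE_HTML), rough_token_count(text))
```

The text version is much shorter, so it costs fewer tokens, but the link target `/item/42` only exists in the HTML attribute and is gone after stripping. That is the kind of data that is only available in HTML.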



