This is interesting. How much difference is it (in cost, quality) by using this ...

marcell · 2024-09-04T06:43:18.000000Z

Good question, I actually haven't tried it with the image capture approach. I'll give that a shot and see how it performs. I'm planning to try many different AI extractors, and see which performs best.

So far, I've done some un-scientific testing to compare text vs. HTML. Text is a lot more effective on a per-token basis, and therefore lower cost. However, some data is only available in HTML.