
I've done similar (but more single-use) things to extract text from PDFs, and data from PDF and PostScript plots. PDFs are actually surprisingly easy to dig into once they're decompressed (e.g. with pdftk), since they're mostly text-based.
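
To give a rough idea of what that digging can look like: after something like `pdftk in.pdf output out.pdf uncompress`, the page content streams are readable, and text is drawn with the `Tj` and `TJ` operators. The Python below is only a minimal sketch under some loud assumptions: the function names are made up for illustration, it only handles literal `( ... )` strings (no hex strings, octal escapes, or nested parentheses), and it ignores font encodings, so it will not work on every PDF.

```python
import re

# A PDF literal string "( ... )" with backslash escapes.
# Assumption: no nested unescaped parentheses inside the string.
LITERAL = re.compile(rb"\((?:\\.|[^\\()])*\)")

# A text-showing operation: "(string) Tj" or "[ (a) -20 (b) ] TJ".
SHOW_TEXT = re.compile(
    rb"(\((?:\\.|[^\\()])*\))\s*Tj|\[((?:\\.|[^\\\]])*)\]\s*TJ", re.S
)

# Only the simple single-character escapes; octal escapes are not handled.
ESCAPES = {b"n": b"\n", b"r": b"\r", b"t": b"\t",
           b"(": b"(", b")": b")", b"\\": b"\\"}


def unescape(raw):
    """Strip the outer parentheses and resolve simple backslash escapes."""
    body = re.sub(rb"\\(.)",
                  lambda m: ESCAPES.get(m.group(1), m.group(1)),
                  raw[1:-1], flags=re.S)
    # Assumption: bytes map directly to characters (no custom font encoding).
    return body.decode("latin-1")


def extract_text_strings(stream):
    """Return the strings shown by Tj/TJ in an uncompressed content stream."""
    shown = []
    for m in SHOW_TEXT.finditer(stream):
        if m.group(1) is not None:
            # "(string) Tj": a single string.
            shown.append(unescape(m.group(1)))
        else:
            # "[...] TJ": join the strings, dropping the kerning numbers.
            parts = LITERAL.findall(m.group(2))
            shown.append("".join(unescape(p) for p in parts))
    return shown


if __name__ == "__main__":
    sample = (b"BT /F1 12 Tf 72 712 Td (Hello, world) Tj ET\n"
              b"BT [(Hel) -20 (lo \\(PDF\\))] TJ ET")
    for line in extract_text_strings(sample):
        print(line)
```

For real files you would read the uncompressed PDF in binary mode and pass its bytes in. Anything using hex strings or subsetted fonts needs more work than this.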


