Hacker News new | past | comments | ask | show | jobs | submit login

I use some custom tools for PDF comparison (visual, textual, and perceptual hash) for my personal records/accounting purposes.

A number of the financial and medical institutions I deal with re-generate PDFs every time you request them, but the content is 99-100% identical. Sometimes just a date changes. So I use a perceptual hash and content comparison to automate detecting truly new documents vs. ones that are only slightly changed.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: