In college (about 15 years ago) I worked for a professor who was compiling precinct-level results for old elections. My job was just to request the info and then do manual data entry. It was abysmally slow.
This application seems very good, but it's still a bit amazing that lawmakers haven't just required that all data be uploaded as CSV! Even if every CSV were in a slightly different format, it would be way easier for everyone (LLM or not).
I could be wildly off-base, but I wonder if some of these systems are airgapped, and the only way the data comes off of the closed system is via printing, to avoid someone inserting a flash drive full of malware in the guise of "copying the CSV file." Obviously there are or should be technical ways to safely extract data in a digital format, but I can see a little value in the provable safety that airgapping gives you.
Sure, but it'd take a literal Act of Congress to force all these states to force all their independent vendors to do a thing, so good luck. And in the government contracting world, each vendor would probably charge each state about a million dollars to do the work. So it's probably better to just use AI to OCR them.