Publishing a tablet is much more than identifying the characters. There's often ...

mikhailfranco · on Oct 30, 2023

I agree OCR for handwriting is difficult.

A few years ago I obtained some scans of records from WWI for my grandfather's unit on the Western Front. All the reports were written in English, but in an elaborate 'copperplate' script, which I imagine is very difficult for automatic OCR. It was hard for me to read as a native English speaker. Imagine writing like that in some wet, shell-straddled trench.

https://en.wikipedia.org/wiki/Copperplate_script

Once tablets are reconstructed, perhaps releasing some kind of 3D scans (raw laser and meshes) for an open competition with a decent prize could be productive (like the prize for the Herculaneum scrolls in the news this week).

wl · on Oct 30, 2023

The situation is even worse for cuneiform. With English, you're looking at approximately 70 glyphs including upper case, lower case, digits, and punctuation. Cuneiform throws you into the hundreds of glyphs. And their forms change over time, often drastically.

For 3D scanning, Reflectance Transformation Imaging is pretty cheap, easy, and popular for imaging tablets.

qingcharles · on Oct 30, 2023

Have you tried GPT-4V? I often throw it crazy stuff like medieval handwriting and it seems to do a pretty good job of reading and translating it.

omneity · on Oct 30, 2023

This feels like an area where synthetic data can help a lot. It should be fairly "easy" to generate cuneiform-like characters, render them on procedurally generated clay tablets, break the tablets using a physics engine and render the pieces from different angles. Training a model to recognize puzzle pieces with this data would be pretty feasible too.
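
To make the idea concrete, here is a toy sketch of the data-generation step. It rests on several assumptions: a "tablet" is just a grid of integer glyph IDs rather than rendered clay, the break is a crude nearest-seed (Voronoi-style) split standing in for a physics engine, and the names `make_tablet`, `break_tablet`, and `reassemble`, along with the glyph count of 600, are made up for illustration.

```python
import random


def make_tablet(rows, cols, n_glyphs, rng):
    """Build a toy tablet: a grid of random glyph IDs (stand-ins for rendered signs)."""
    return [[rng.randrange(n_glyphs) for _ in range(cols)] for _ in range(rows)]


def break_tablet(tablet, n_pieces, rng):
    """Split the tablet into fragments by assigning each cell to its nearest seed cell.

    This is a stand-in for a physics-based fracture, not a simulation of one.
    Each fragment maps (row, col) -> glyph ID; the coordinates are the ground truth.
    """
    rows, cols = len(tablet), len(tablet[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    seeds = rng.sample(cells, n_pieces)  # distinct seed cells, one per fragment
    pieces = [{} for _ in seeds]
    for r, c in cells:
        nearest = min(
            range(n_pieces),
            key=lambda i: (r - seeds[i][0]) ** 2 + (c - seeds[i][1]) ** 2,
        )
        pieces[nearest][(r, c)] = tablet[r][c]
    return pieces


def reassemble(pieces, rows, cols):
    """Rebuild the full grid from fragments using their ground-truth coordinates."""
    grid = [[None] * cols for _ in range(rows)]
    for piece in pieces:
        for (r, c), glyph in piece.items():
            grid[r][c] = glyph
    return grid


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the generated sample is reproducible
    tablet = make_tablet(rows=6, cols=8, n_glyphs=600, rng=rng)
    pieces = break_tablet(tablet, n_pieces=4, rng=rng)
    print([len(p) for p in pieces])
```

For training, the absolute coordinates would be withheld from the model and kept only as labels, so each synthetic break yields fragments with a known correct arrangement.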

suoduandao3 · on Oct 30, 2023

Weird question - I remember hearing that pre-writing societies have 'memory objects' that use a series of symbols to help prompt memorization of an epic poem or the like when held in one's hands. Might cuneiform have been intended to be 'read' more like braille, by touch?