
Among European languages, Serbo-Croatian probably comes closest to a fully phonemic spelling. An interesting way to test this is to train a basic language model on a representative sample of each language, and then see how many mistakes it makes on words it hasn't seen (https://aclanthology.org/2021.sigtyp-1.1/). In this study, Serbo-Croatian scored over 99% for both reading and writing accuracy. Finnish and Turkish are also pretty good.
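To make the idea concrete, here is a rough sketch of that kind of held-out test. It is not the method from the linked study: the "model" is just a most-frequent-sound-per-letter lookup, and both word lists are invented toy orthographies rather than real language data. The names `train`, `read_aloud`, and `accuracy` are made up for this illustration.

```python
from collections import Counter, defaultdict

def train(pairs):
    """Learn the most frequent sound for each letter from (spelling, sounds) pairs."""
    counts = defaultdict(Counter)
    for spelling, sounds in pairs:
        if len(spelling) != len(sounds):
            continue  # this toy model only handles one letter per sound
        for letter, sound in zip(spelling, sounds):
            counts[letter][sound] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in counts.items()}

def read_aloud(model, spelling):
    """Predict a pronunciation letter by letter; unknown letters become '?'."""
    return "".join(model.get(letter, "?") for letter in spelling)

def accuracy(model, held_out):
    """Fraction of unseen words whose predicted pronunciation is exactly right."""
    correct = sum(read_aloud(model, s) == p for s, p in held_out)
    return correct / len(held_out)

# Invented toy orthography where every letter always has the same sound.
regular_train = [("kata", "kata"), ("miso", "miso"), ("tuki", "tuki")]
regular_test = [("kimo", "kimo"), ("sati", "sati")]

# Invented toy orthography where "c" is usually "k" but sometimes "s".
irregular_train = [("cat", "kat"), ("cot", "kot"), ("cit", "sit"), ("tac", "tak")]
irregular_test = [("cio", "sio"), ("tic", "tik")]

print(accuracy(train(regular_train), regular_test))
print(accuracy(train(irregular_train), irregular_test))
```

The regular toy spelling gets every unseen word right, while the irregular one trips on the ambiguous letter. Swapping the roles of spelling and pronunciation in the pairs gives the "writing" direction of the same test.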

