This looks very much web-based and not properly cleaned. I conclude that because digits are pretty rare in a normal corpus, much rarer than x and y. The English list also includes some punctuation and half of the Greek alphabet. I suppose the counting didn't exclude proper names and formulas. So if you want to identify the domain of a Wikipedia page based on 1-grams, this is helpful; otherwise, less so.
https://web.archive.org/web/20221028111744/http://simia.net/...
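
For what it's worth, here is a rough sketch of the kind of cleaning pass I have in mind for a 1-gram list: drop digits, punctuation, Greek letters, and stray single letters like x and y. Everything here is my own assumption, not how the list was actually built: the names `classify_token` and `clean_unigrams` are made up, the sample counts are invented, and keeping "a" and "i" as the only legitimate one-letter English words is a simplification. It does nothing about proper names.

```python
import unicodedata

# Assumption: the only one-letter tokens worth keeping in English.
KEEP_SINGLE_LETTERS = {"a", "i"}


def classify_token(token: str) -> str:
    """Return a rough label for one entry of a 1-gram list."""
    if not token:
        return "empty"
    # Pure digit strings such as "7" or "2022".
    if all(ch.isdigit() for ch in token):
        return "digits"
    # Unicode categories starting with P (punctuation) or S (symbols).
    if all(unicodedata.category(ch).startswith(("P", "S")) for ch in token):
        return "punctuation"
    # Any Greek character, which in English text usually comes from formulas.
    if any("GREEK" in unicodedata.name(ch, "") for ch in token):
        return "greek"
    # Stray variables like "x" and "y".
    if len(token) == 1 and token.isalpha() and token.lower() not in KEEP_SINGLE_LETTERS:
        return "single_letter"
    return "word"


def clean_unigrams(counts: dict[str, int]) -> dict[str, int]:
    """Keep only the entries that classify_token labels as ordinary words."""
    return {tok: n for tok, n in counts.items() if classify_token(tok) == "word"}


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration.
    sample = {"the": 100, "x": 5, "α": 3, "2022": 9, ",": 50, "a": 80}
    print(clean_unigrams(sample))
```

If the goal is domain identification rather than a general-purpose frequency list, you would want to skip this filter, since the formula residue is exactly the signal.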