
From this article:

More empirically, Shannon's lower estimate suggests that humans might be able to compress enwik9 down to 75MB, and computers some day may do better.




Notable that enwik8/9 isn't really just human text: roughly half of the data is XML markup, which may not have the same properties as text.


The current best compresses it to <15MB, but that's a program, not a human, and a human is what that example is referring to. :)


That's enwik8. enwik9 is an order of magnitude larger.
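
To put the two figures on the same scale, here is a rough sketch of the arithmetic. It assumes enwik8 is 10^8 bytes and enwik9 is 10^9 bytes, that 1 MB means 10^6 bytes, and that one byte is treated as one character (not quite true for UTF-8 text with XML markup). The function names are made up for illustration.

```python
# Sizes assumed here: enwik8 = 10^8 bytes, enwik9 = 10^9 bytes.
ENWIK8_BYTES = 10**8
ENWIK9_BYTES = 10**9


def bits_per_byte(compressed_bytes, original_bytes):
    """Compressed bits spent per byte of original input."""
    return 8 * compressed_bytes / original_bytes


def compressed_size_mb(bits_per_char, original_bytes):
    """Compressed size in MB (10^6 bytes), treating 1 byte as 1 character."""
    return bits_per_char * original_bytes / 8 / 1e6


# The 75MB figure quoted for enwik9 works out to about 0.6 bits per byte.
print(bits_per_byte(75e6, ENWIK9_BYTES))

# 15MB on enwik8 is about 1.2 bits per byte, i.e. twice the rate above.
print(bits_per_byte(15e6, ENWIK8_BYTES))

# Going the other way: a 0.6 bits/char rate on enwik9 is about 75MB.
print(compressed_size_mb(0.6, ENWIK9_BYTES))
```

So on these assumptions, <15MB on enwik8 is not yet below the 75MB estimate for enwik9; the same rate on enwik9 would come to about 150MB.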



