
Yep! That is one of the things we’ve worked the hardest on. It’s a completely new indexing structure built on an append-only hash trie, which scales really well. We’ve tested it with many big datasets, including importing all of Wikipedia as files in a single folder. Worked like a charm :)
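
For readers who haven't run into the term: below is a rough Python sketch of what an append-only hash trie can look like. It is not the implementation discussed here, just an illustration under my own assumptions. The class name `AppendOnlyHashTrie`, the 4-bit fan-out, the SHA-256 based key hash, and the in-memory list standing in for an on-disk log are all made up for the example. The idea it shows is that nodes are only ever appended, never modified, so an update copies the path from the changed leaf up to a new root and older roots keep working as snapshots.

```python
import hashlib

BITS = 4              # hash bits consumed per trie level (assumption)
FANOUT = 1 << BITS    # 16 child slots per branch node


class AppendOnlyHashTrie:
    """Illustrative sketch only: nodes are appended to a log and never mutated."""

    def __init__(self):
        self.log = []      # append-only node storage (stand-in for a file)
        self.root = None   # index of the current root node in the log

    @staticmethod
    def _hash(key):
        # 64-bit hash taken from the first 8 bytes of SHA-256
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    @staticmethod
    def _slot(h, depth):
        # Child slot selected by the hash bits for this level
        return (h >> (depth * BITS)) & (FANOUT - 1)

    def _append(self, node):
        self.log.append(node)
        return len(self.log) - 1

    def _insert(self, idx, depth, h, key, value):
        if idx is None:
            return self._append(("leaf", h, ((key, value),)))
        node = self.log[idx]
        if node[0] == "leaf":
            _, leaf_hash, entries = node
            if leaf_hash == h:
                # Same hash: write a new leaf with the key replaced or added
                kept = tuple(e for e in entries if e[0] != key)
                return self._append(("leaf", h, kept + ((key, value),)))
            # Different hash: push the old leaf one level down, then retry
            children = [None] * FANOUT
            children[self._slot(leaf_hash, depth)] = idx
            branch = self._append(("branch", tuple(children)))
            return self._insert(branch, depth, h, key, value)
        # Branch: copy it with one child slot pointing at the new subtree
        children = list(node[1])
        slot = self._slot(h, depth)
        children[slot] = self._insert(children[slot], depth + 1, h, key, value)
        return self._append(("branch", tuple(children)))

    def put(self, key, value):
        """Store a value and return the new root index (usable as a snapshot)."""
        self.root = self._insert(self.root, 0, self._hash(key), key, value)
        return self.root

    def get_at(self, root, key):
        """Look up a key as of the given root index."""
        h, idx, depth = self._hash(key), root, 0
        while idx is not None:
            node = self.log[idx]
            if node[0] == "leaf":
                if node[1] == h:
                    for k, v in node[2]:
                        if k == key:
                            return v
                return None
            idx = node[1][self._slot(h, depth)]
            depth += 1
        return None

    def get(self, key):
        return self.get_at(self.root, key)
```

Usage is along the lines of `old = trie.put("Earth.html", 1)` followed by `trie.put("Earth.html", 2)`: `trie.get("Earth.html")` then sees the new value while `trie.get_at(old, "Earth.html")` still sees the old one. Each write touches only one short path of nodes, however many keys share the same "folder", which is the property that tends to make this kind of structure hold up with very large flat directories.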


This one? https://dumps.wikimedia.org/other/static_html_dumps/current/... How long did it take to import?


I think it was that one, yes. I can’t remember exactly how long it took, since we ran it over a couple of days because of some unrelated computer issues.



