
Indeed, it is a significant performance boost. I believe I was the first to suggest using the alias method for w2v. A few years back I extended it to work in a distributed setting [1], which allowed me to scale the problem to a dataset of 1 trillion words with a lexicon of 1.4 billion words (roughly the top 40 billion web pages of Yahoo's search engine).

[1] https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14956
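
For readers who have not seen it, here is a minimal single-machine sketch of the alias method (Vose's variant) as it might be used to draw negative samples. It is not the distributed code from [1]. The function names, the toy counts, and the 0.75 exponent (the smoothing used in the original word2vec negative-sampling setup) are assumptions made for illustration.

```python
import random

def build_alias_table(weights):
    """Build Vose alias tables for a discrete distribution in O(n)."""
    n = len(weights)
    total = float(sum(weights))
    # Scale so that the average bucket mass is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n   # chance of keeping bucket i's own index
    alias = [0] * n    # fallback index for bucket i
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # Bucket l donates the mass needed to top up bucket s.
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftovers have mass 1 up to rounding error.
    for i in small + large:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias

def sample(prob, alias, rng=random):
    """Draw one index in O(1): pick a bucket, then flip a biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

# Hypothetical toy vocabulary with raw counts, smoothed by count ** 0.75.
counts = {"the": 1000, "cat": 50, "sat": 30, "zygote": 1}
words = list(counts)
prob, alias = build_alias_table([counts[w] ** 0.75 for w in words])
negatives = [words[sample(prob, alias)] for _ in range(5)]
```

The table costs O(n) to build once, after which every draw is constant time regardless of vocabulary size, instead of a binary search over a cumulative distribution.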



