
The paper itself says the only change is normalizing by the context window size C.



Ah, but I've now looked at their code, and it's not the only change! They've also eliminated the `reduced_window` method of weighting-by-distance that's present in `word2vec.c`, Gensim, and FastText.
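For anyone unfamiliar with the trick, here's a rough sketch of what `reduced_window` does, as I understand it from `word2vec.c`: for each center word, the window is shrunk by a random amount drawn uniformly from 0 to `window - 1`, so nearer neighbors get sampled more often than distant ones. The function names below (`context_positions`, `inclusion_probability`) and the `shrink` flag are my own illustrative inventions, not anything from the reference code or from Gensim's API.

```python
import random

def context_positions(center, sentence_len, window, rng, shrink=True):
    """Return the indices used as context for the word at `center`.

    With shrink=True, mimic the reduced-window idea: draw b uniformly
    from 0..window-1 and use an effective window of (window - b).
    With shrink=False, always use the full window.
    """
    b = rng.randrange(window) if shrink else 0  # hypothetical stand-in for `reduced_window`
    effective = window - b                      # always between 1 and `window`
    start = max(0, center - effective)
    end = min(sentence_len, center + effective + 1)
    return [i for i in range(start, end) if i != center]

def inclusion_probability(distance, window):
    """Chance that a neighbor `distance` positions away ends up in the context.

    The effective window is uniform over 1..window, so a neighbor at
    distance d is kept whenever the effective window is at least d.
    """
    if distance < 1 or distance > window:
        return 0.0
    return (window - distance + 1) / window

if __name__ == "__main__":
    rng = random.Random(0)
    print(context_positions(5, 11, 3, rng, shrink=False))
    for d in range(1, 6):
        print(d, inclusion_probability(d, 5))
```

So with `window=5`, an adjacent word is always used, while a word 5 positions away is used only about one time in five. Dropping this changes the effective weighting of the training pairs, independently of any normalization by C.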

What if that, rather than the normalization, is the real reason for their sometimes slightly-better, sometimes slightly-worse performance on some benchmarks? Perhaps there are other changes, too.

This is why I continue to think Gensim's policy of matching the reference implementations from the original authors, at least by default, is usually the best policy – rather than using an alternate interpretation of the often-underspecified papers.



