The word2vec implementation has many details that the paper either doesn't mention or barely emphasizes. If memory serves, the source isn't very well commented either.
This is another paper that's basically just about some details of word2vec and GloVe and their effects on the results:
Improving Distributional Similarity with Lessons Learned from Word Embeddings - ACL Anthology https://www.aclweb.org/anthology/Q15-1016/