Hacker News

This might be a dumb question, but... if I take the embeddings of words with a common theme, like "burning", "warm", "cool", "freezing", could I fit an arc (or line) through them reasonably well? So that if I interpolate along that arc/line, I get vectors close to "hot" and "cold"?



This was the original argument of the King-Queen-Man-Woman Word2Vec paper. It turns out the answer is: yes to a degree, but not beyond basic categories. All embeddings are trained for whatever their creator decides they should do: to represent semantic (meaningful) similarity, or similar word use, or topics or domains, or level of language use, or indeed to work multilingually and to clump together embeddings in one language, etc.

Different models will give you different results. Many are built for search and retrieval, for which MTEB is a good benchmark. But those generally won't "excel" at what you propose; the vectors will just be in the same area.
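If you want to check this against a particular model, here is a minimal sketch of the test in plain Python. It uses the simplest version of the idea: linear interpolation between two anchor words, then a cosine nearest-neighbour lookup. It does not fit a line or arc through all four points. The `toy` vectors are hand-made for illustration and are NOT real embeddings; the helper names (`lerp`, `cosine`, `nearest`) are my own. You would swap in vectors from whatever model you are testing.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between vectors a and b at fraction t in [0, 1]."""
    return [x + t * (y - x) for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, vocab, exclude=()):
    """Return the word in vocab (minus exclude) most similar to query."""
    candidates = [w for w in vocab if w not in exclude]
    return max(candidates, key=lambda w: cosine(query, vocab[w]))

# ASSUMPTION: toy 3-d vectors made up so that "temperature" lies along the
# first axis. Real embeddings have hundreds of dimensions and no such
# guarantee. Replace this dict with real model output.
toy = {
    "burning":  [1.0, 0.1, 0.2],
    "hot":      [0.7, 0.2, 0.2],
    "warm":     [0.3, 0.3, 0.2],
    "cool":     [-0.3, 0.3, 0.2],
    "cold":     [-0.7, 0.2, 0.2],
    "freezing": [-1.0, 0.1, 0.2],
    "banana":   [0.0, -0.9, 0.5],  # unrelated distractor
}

# Midpoint between two anchors, then look up the closest remaining word.
midpoint = lerp(toy["burning"], toy["warm"], 0.5)
print(nearest(midpoint, toy, exclude=("burning", "warm")))
```

With these toy vectors the lookup returns "hot", but only because the vectors were constructed that way. Whether a real model behaves like this is exactly the open question, so run it over many word triples rather than trusting one example.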



