When I experimented with word vector representations (word2vec), I found that the analogy arithmetic was a bit oversold in a subtle way. While some sorts of analogies worked (the easiest to see was gender: woman-man+boy=girl), other sorts of analogies that looked just as straightforward to a human were not captured by the embeddings at all.
For example,
water-boat+plane=potable_water (air, sky, etc nowhere to be found).
small-big+tall=taller (short, etc nowhere to be found).
shell-pecan+coconut=shells (husk nowhere to be found).
What I eventually realized is that the vector representations only really work in cases where an averaging operation would get the right answer anyway. For example, woman-man+boy=girl, but you can also get that through (woman+boy)/2=girl (or even woman+boy=girl). In my experimentation I have yet to see a case where the subtraction is actually necessary to recover the relationship. This would indicate that the vectors are only capturing similarity between words, and not any second-order relationships between them, as is commonly claimed.
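One way to test that is to compare the full subtraction query against a query that just averages the positive words. A minimal sketch, assuming gensim and the pretrained GoogleNews-vectors-negative300.bin file are available locally (the file name and the vocabulary limit are placeholders, not something from the original post):

```python
# Sketch: analogy-by-subtraction vs. plain averaging of the positive words.
# Assumes gensim is installed and the pretrained Google News vectors
# (GoogleNews-vectors-negative300.bin) have been downloaded locally.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True, limit=500_000
)

# Classic analogy query: woman - man + boy ~= ?
print(wv.most_similar(positive=["woman", "boy"], negative=["man"], topn=5))

# Averaging only: most_similar with no negative term just averages the
# (normalized) positive words, i.e. effectively (woman + boy) / 2 ~= ?
print(wv.most_similar(positive=["woman", "boy"], topn=5))
```

If "girl" tops both lists, the subtraction isn't doing the work the analogy framing suggests.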
Honestly, I wish I could discuss this directly with someone doing research in this area, because it seems to me that the vector representations really aren't capturing relationships in the way they're touted. I think it's an interesting possibility, and I haven't seen it addressed in any of the literature I've read.
EDIT: On further investigation, it looks like the Google News vectors maybe just aren't that good? I was hardly able to reproduce any of the analogy examples given in the table "Relationship pairs in a word embedding" from the OP.
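For what it's worth, gensim bundles the standard questions-words.txt analogy set from the original word2vec work, so the vectors can be scored wholesale instead of spot-checked against the table. A rough sketch, under the same assumptions as the snippet above:

```python
# Sketch: score the pretrained vectors on the standard analogy test set.
# Assumes the same GoogleNews-vectors-negative300.bin file as above.
from gensim.models import KeyedVectors
from gensim.test.utils import datapath

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True, limit=500_000
)

# questions-words.txt ships with gensim's test data.
score, sections = wv.evaluate_word_analogies(datapath("questions-words.txt"))
print(f"overall analogy accuracy: {score:.3f}")
```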
The maths works when you are acting along the same vector(s), so things like king:male ~= queen:female work. In that example you are doing maths along the 'gender vector'.
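That can be made concrete by checking whether the woman-man and queen-king offsets point in roughly the same direction while an unrelated offset doesn't. A rough sketch, again assuming the pretrained Google News vectors:

```python
# Sketch: are the "gender" offsets roughly parallel?
import numpy as np
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True, limit=500_000
)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

gender_offset = wv["woman"] - wv["man"]
royal_offset = wv["queen"] - wv["king"]
unrelated_offset = wv["plane"] - wv["boat"]

# If analogies work along a shared direction, the first value should be
# clearly higher than the second.
print(cosine(gender_offset, royal_offset))
print(cosine(gender_offset, unrelated_offset))
```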
In your water/boat/plane example I'm not sure what you are expecting. It seems unlikely that there is a 'planeiness' vector one can move along to find concepts like that.
From memory, the default word2vec implementation and data give you ~500 dimensions. I think some of the examples given are failing because of that limitation.
Always excellent content from Christopher Olah. So few people do such a clear job of explaining these concepts -- this is technical writing done right.
Because neuroscience is not yet very informative about what will work well or efficiently. Reading the words "neural network" and thinking it has anything to do with the brain is a mistake.