I did something like this in a project for my former work at a research group, a...

openasocket · on April 28, 2017

Thanks, that's helpful! And you can email me at <my HN username> at gmail.com.

I'm also curious if there are any ways to quantify, mathematically, the changes over time. There's the simple sum of the squared distances each point moves between time steps, which gives a sense of the "kinetic energy" of the system, but I'm wondering if there are some more clever analyses, especially something that can quantify localized changes versus global changes.
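To make the "kinetic energy" idea concrete, here is a rough sketch, assuming the two snapshots are already in the same coordinate system and list the same words in the same row order. The function name `displacement_energy` and the toy arrays are made up for illustration. Keeping the per-word terms around, instead of only the total, gives a crude handle on localized versus global change.

```python
import numpy as np

def displacement_energy(prev, curr):
    """Return (per-word squared displacement, total) between two snapshots.

    prev, curr: arrays of shape (n_words, dim), rows in the same word order
    and assumed (hypothetically) to share one coordinate system.
    """
    prev = np.asarray(prev, dtype=float)
    curr = np.asarray(curr, dtype=float)
    per_word = np.sum((curr - prev) ** 2, axis=1)  # squared distance moved, per word
    return per_word, float(per_word.sum())

# Toy data: only the third word moves between the two snapshots.
prev = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
curr = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 3.0]])

per_word, total = displacement_energy(prev, curr)
# Fraction of the total carried by the single largest mover:
# near 1.0 suggests a localized change, near 1/n_words a global one.
share = per_word.max() / total
```

This only makes sense if the vectors are comparable across snapshots in the first place.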

Edit: so are you running a separate word2vec thing for each year's dataset? If so, how do you map between them? The orientation that each word2vec mapping generates will be random, and I worry that trying to rotate the mappings to some common axis could obscure some of the data.

swf · on April 28, 2017

Sent! Yeah, we made a model for each year's dataset. In our case, we were only interested in the similarity between our target protein and the others, so we used the model's similarity measure between those in order to avoid problems with varying orientations between models.
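A small sketch of why comparing similarities sidesteps the orientation issue, using plain NumPy as a stand-in rather than our actual code. The helper `cosine_to_target` and the random "embeddings" are hypothetical. Cosine similarity depends only on angles between vectors inside one model, so rotating the whole space leaves it unchanged.

```python
import numpy as np

def cosine_to_target(vectors, target_idx):
    """Cosine similarity of every row to the row at target_idx."""
    vectors = np.asarray(vectors, dtype=float)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit[target_idx]

rng = np.random.default_rng(0)
year_a = rng.normal(size=(5, 4))  # stand-in for one year's embeddings

# Build a random orthogonal matrix via QR and apply it to the whole space,
# mimicking an arbitrary change of orientation.
q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
year_a_rotated = year_a @ q

sims = cosine_to_target(year_a, 0)
sims_rotated = cosine_to_target(year_a_rotated, 0)
same = np.allclose(sims, sims_rotated)  # similarities survive the change of orientation
```

Separately trained models differ by more than orientation, so this only shows that the orientation part stops mattering once you compare similarities instead of raw coordinates.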