Hacker News | flo_hu's comments

I removed my blog post from Medium's distribution, so it should now be freely accessible! https://blog.esciencecenter.nl/king-man-woman-king-9a7fd2935...


There are several use cases listed for which Markdown could be used, such as paper writing, presentations, etc. (I'm not sure I am going to use it for the latter in the near future...).

I find that one of the most interesting potential use-cases is in "literate programming" on which the same author wrote another blog post: https://blog.esciencecenter.nl/entangled-1744448f4b9f


No. That's of course still "King" :)

(but sure, one could also pick queen, prince, or royal from the list...)

Just tested it here: http://vectors.nlpl.eu/explore/embeddings/en/calculator/#

And it gave me 0.63 King, 0.6 Prince etc...


So =Prince, because you should exclude King, similar to how you exclude it to get Queen in the original example.
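A minimal sketch of that exclusion step, with made-up 3-d toy vectors (real embeddings are typically 100-300 dimensional and will give different numbers; with these toy values the input word "king" happens to rank first, and "prince" is the best non-input match):

```python
import numpy as np

# Made-up toy vectors purely for illustration.
emb = {
    "king":   np.array([0.9, 0.8, 0.1]),
    "queen":  np.array([0.9, 0.1, 0.8]),
    "man":    np.array([0.5, 0.5, 0.5]),
    "woman":  np.array([0.5, 0.5, 0.6]),
    "prince": np.array([0.6, 0.5, 0.3]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = emb["king"] - emb["man"] + emb["woman"]
ranked = sorted(emb, key=lambda w: cosine(query, emb[w]), reverse=True)

# Drop the input words, as e.g. gensim's most_similar does by default.
answer = next(w for w in ranked if w not in {"king", "man", "woman"})
print(ranked[0], answer)
```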


I would also agree with you that it is fine to add additional rules to improve the outcome, but then it should be made clear in the way the result is presented (as you say, that rarely happens in intro-level tutorials/courses).

Your last point sounds like a cool idea! Using those more in-depth metrics to find weaknesses and see if other, complementary algorithms can fill the gap.


*should?


Good point! I would see this rather as yet another argument for why you should simply give the actual output of the NLP algorithm.

So if people actually do the calculation King-Man+Woman and it comes closest to King, then they should report "King-Man+Woman~=King" and not "King-Man+Woman=Queen" (only because that's what they expected).


To be honest, I think the idea that we should expect ML algorithms to give a single, certain answer is misguided. I would expect the output from this algorithm to be "King - Man + Woman = King (90%), Queen (83%), Prince (70%)" or something like that, i.e. a list of answers with some measure of how "good" those answers are. Then again, I work in a field that doesn't really have categorical answers so maybe I'm missing something obvious.


That's pretty much correct. You would typically calculate a vector for "King-Man+Woman" and then do a query on this based on a cosine distance (or similar measure) over the entire vocabulary.

The query would give you a ranked list of the closest word vectors with scores that indicate how good the match is.
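For concreteness, a toy version of that ranked query (made-up 3-d vectors; with real embeddings the scores and ordering will differ, though as discussed the input word itself often tops the list):

```python
import numpy as np

# Made-up toy vocabulary; real embeddings are typically 100-300 dimensional.
vocab = {
    "king":   np.array([0.9, 0.8, 0.1]),
    "queen":  np.array([0.9, 0.1, 0.8]),
    "man":    np.array([0.5, 0.5, 0.5]),
    "woman":  np.array([0.5, 0.5, 0.6]),
    "prince": np.array([0.6, 0.5, 0.3]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compute the analogy vector, then rank the whole vocabulary by similarity.
query = vocab["king"] - vocab["man"] + vocab["woman"]
ranked = sorted(((cosine(query, v), w) for w, v in vocab.items()), reverse=True)
for score, word in ranked:
    print(f"{word}: {score:.2f}")
```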


But the example is only performing vector operations. You could perhaps normalize the distances of a number of vectors with a softmax or something to produce a probability across a set, but what's being presented in the paper is the "closest" vector following the operations in terms of cosine distance.
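A sketch of that softmax normalization, using the two scores mentioned upthread (0.63 for King, 0.6 for Prince) plus a made-up third value:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

# 0.63 (King) and 0.60 (Prince) are from the calculator result above;
# 0.41 (Queen) is made up for illustration.
sims = np.array([0.63, 0.60, 0.41])
probs = softmax(sims)
print(probs)
```

Note that the result is only a distribution over the chosen candidate set, not a real probability over the whole vocabulary.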


In the end, it doesn't matter what transformation you do, though, as long as you do it consistently and not in an ad hoc manner. If excluding the original term always leads to useful results, it is a useful transformation.

The problems materialize when you're just cherry picking for results.


Exactly! I think that was part of the problem for many of the examples that turn out to not-really-work. People pretend they let an algorithm do the work, but then hand-pick from a list of somewhat close candidates. Which of course happens with a hypothesis (and thereby a desired outcome) in mind.


I'm fine with the free lunch thing. But here the cheating is done on the level of how people present the capabilities of the tool. If you ask the algorithm "SHE is to LOVELY as HE is to X", the reported answer (Bolukbasi 2016) was "BRILLIANT", which in this case suggests a heavy gender bias. But what the algorithm actually gives for X is: "LOVELY". The authors just picked the 10th example in the list without clearly stating it.


> The authors just picked the 10th example in the list without clearly stating it.

That's not an accurate description of what Bolukbasi et al. (2016) [0] did. In particular, they do not list x close to lovely + he - she and then pick arbitrarily from that list. Instead, they explicitly reject that approach (see appendix A), because they're looking for pairs of words that are maximally gendered. They do that by finding x and y such that the angle between x - y and she - he is minimized. Since the task they're solving is different, you can't fault them for getting different results.

[0] https://arxiv.org/abs/1607.06520
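If I read appendix A right, the pair search is roughly the following (made-up 2-d toy vectors purely for illustration, not the paper's actual data; minimizing the angle is the same as maximizing the cosine):

```python
import numpy as np
from itertools import combinations

# Made-up toy vectors; real embeddings differ.
emb = {
    "she": np.array([0.0, 1.0]),
    "he":  np.array([0.0, -1.0]),
    "lovely":    np.array([0.2, 0.9]),
    "brilliant": np.array([0.2, -0.9]),
    "table": np.array([1.0, 0.0]),
    "chair": np.array([0.9, 0.1]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

gender_dir = emb["she"] - emb["he"]
candidates = ["lovely", "brilliant", "table", "chair"]

# Pick the pair (x, y) whose difference vector best aligns with she - he,
# i.e. whose angle to the gender direction is smallest.
best = max(combinations(candidates, 2),
           key=lambda p: cosine(emb[p[0]] - emb[p[1]], gender_dir))
print(best)
```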


Ok, thanks a lot for bringing this up! I will have a closer look at that.


"Cheating" (in practice) usually means "embedding problem/domain specific quirks"

In the "King" example, you're adding and subtracting two words that are probably very close already, so if you want to find "something else" besides itself, you need to exclude it. For some problems it might make sense, for some others it might not.


Despite all great advances in deep learning and big data, scientific research often is more about getting the most from very little data.

