
I've been looking a little bit at anaphora resolution in AI, and the trouble is that your phrase could appear in many contexts where Google's translation would be completely correct! If "cousin" is the antecedent of "her" (which is the only possibility if this phrase occurs in isolation), then Google's translation is clearly wrong. But if there is a preceding or following sentence which mentions a third person, Google's translation can be correct, because "her" can then refer to that person.

For example "my cousin and her wife think that Sarah has good taste in ice cream": here one likely anaphora resolution is "my cousin and Sarah's wife think that Sarah has good taste in ice cream". Or "when Sarah got married, she invited my cousin to the wedding; my cousin and her wife turned out to have gone to college together" (again "my cousin and Sarah's wife turned out to have gone to college together").

Anaphora resolution in general is one of the hardest problems for machine translation because doing it as well as human beings do appears to require so much knowledge about the world. But also, different resolutions can be correct (or maximum-probability) in different contexts depending on the additional information! For instance, there's the Winograd Schema structure, where a single pronoun is interpreted as referring to different people depending on the surrounding context (but not on the grammar). Winograd's classic example was

The city council members refused the demonstrators a permit because they feared violence.

The city council members refused the demonstrators a permit because they advocated violence.

Disturbingly for machine translation, in the former sentence "they" refers to "the city council members", while in the latter sentence "they" refers to "the demonstrators", even though the syntax of the two sentences is identical!
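To make that concrete, here is a deliberately naive sketch (my own toy, not any real coreference system) of a resolver that uses only position in the sentence: it picks the candidate noun phrase that ends closest before the pronoun. The candidate list and the function name `nearest_antecedent` are assumptions made up for this illustration. Because the two sentences differ only in the verb after "they", a rule like this has to give the same answer for both, so it is wrong for at least one of them.

```python
# Toy illustration only: a resolver that looks at word position and nothing else.
# The candidate noun phrases are hard-coded for this one example (an assumption).
CANDIDATES = ["the city council members", "the demonstrators"]


def nearest_antecedent(sentence, pronoun="they"):
    """Return the candidate noun phrase that ends closest before the pronoun."""
    lowered = sentence.lower()
    pronoun_pos = lowered.index(f" {pronoun} ")
    best, best_end = None, -1
    for noun_phrase in CANDIDATES:
        start = lowered.find(noun_phrase)
        if start == -1:
            continue  # this candidate does not occur in the sentence
        end = start + len(noun_phrase)
        # Keep the candidate whose end is nearest to (but before) the pronoun.
        if best_end < end <= pronoun_pos:
            best, best_end = noun_phrase, end
    return best


feared = ("The city council members refused the demonstrators a permit "
          "because they feared violence.")
advocated = ("The city council members refused the demonstrators a permit "
             "because they advocated violence.")

print(nearest_antecedent(feared))     # the demonstrators (wrong for this sentence)
print(nearest_antecedent(advocated))  # the demonstrators (right for this sentence)
```

Swapping the rule for "pick the subject" just flips which sentence it gets wrong. Anything that gets both right has to know something about who tends to fear violence and who tends to advocate it.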

This, in turn, means that if a translation task required knowing the antecedent of "they" in the truncated sentence "The city council members refused the demonstrators a permit because they", the task would have no unique solution, because the antecedent is ambiguous. Formally this is also true of every reference to, for example, family members when one language marks gender and another doesn't, even if the local context offers a likely resolution, as in your sentence. There is no unique translation available. Finding the one intended by the speaker will require more context, while even finding the one that other speakers find most probable given limited context is sometimes among the most challenging AI problems today.



