
This is actually a really good example of the limitation of current ML/NLP approaches, that there isn't really any level of 'comprehension' at all.



> This is actually a really good example of the limitation of current ML/NLP approaches, that there isn't really any level of 'comprehension' at all.

That happens even with humans, so I'm not sure that follows. "Oh, sorry, you meant Mars, I heard Moon"


It would be more charitable to say that current ML/NLP approaches have orders of magnitude less understanding than humans. The ML/NLP process just jumps straight to the correlated answer, which a human might also do, but the human would then apply higher-order reasoning: "wait, Mars? That wasn't where he set foot."

This to me is similar to current autonomous driving limitations. The cars can only respond to situations they have seen before. Any novel element leads to failure, whereas a human can fall back on higher-order reasoning: "That is a stoplight, yes, but it is on the back of a truck, ignore it" (real example).


I mean that there is no semantic comprehension or knowledge of 'Neil Armstrong' (as a specific person), 'land' (as an action), or 'Mars' (as a place). It's not a mishearing, because computers don't mishear ASCII input the way we might mishear an auditory signal.

I get the same result when I Google 'when did Lance Armstrong land on the moon', 'when did Buzz Lightyear land on the moon', or 'when did Lance Armstrong land on Mars.'


I just learned they really sent Buzz Lightyear into space. It somehow passed me by.


> That happens even with humans, so I'm not sure that follows.

When it happens with humans, it's also an example of non-comprehension, possibly for a different reason.

We don't see things as they are, we see them as we are.

Actually, perhaps that's the same reason.


If you ask the average American "When did Neil Gaiman set foot on the moon?", most would answer that they don't know exactly but think it was in the '60s.

This is not a limitation of AI; it's exactly what you want it to do. It's reading into the context of the question and finding it more likely that you made a mistake in your question than that you seriously want an answer to a constructed, nonsensical question that has no frame of reference or context in our common knowledge pool.

If you want exact logical answers deduced from base propositions, you don't want ML models or "AI"; you're looking at formal logic and deduction.


> Most would answer that they don't know exactly but think it was in the '60s.

I think this is an important point. Humans and Google have a similar bias (for different reasons): when asked a question, they want to be helpful or to seem knowledgeable, so they won't say "I am confused by that question" and will instead answer a similar question that they can understand, in the hope that the answer is approximately right.

In the case of asking when Neil Armstrong landed on Mars, guessing "I think it was in the '60s" is accurate to within 10 years but off by 56 million km. For a good example of what average people think about the solar system, though, consider the question "Is the moon really a planet?" that was once part of an impromptu debate on TV:

https://www.cnet.com/news/qvc-stars-confused-about-whether-t...


I don't think the average idiot on the street is the benchmark Google should strive for.

The way it used to work, you had one intelligent entity forming a query and a machine performing said query. Somewhere along the line, Google got the idea that returning no result was a bad outcome and started overriding queries. It's like you go to the hardware store and ask for nails, but the shopkeeper starts telling you all about nail salons in the vicinity because they ran out of finishing nails - or worse, goes nail salon -> spa -> finishing -> ending -> happy ending.

We went from an intelligent entity forming a query and a machine performing said query to an idiot computer treating you like an idiot.


In a school where I studied, something similar was used as a trick question on a history exam: "Which language did Vladimir Lenin use to write correspondence addressed to Karl Marx?" or something like that. Nearly half of the class failed on this. For those unaware: Lenin discovered Marx's book, Capital, in 1887, while Marx died in 1883, so there could not have been any correspondence.


More appropriately, you could ask which language did Karl Marx use to write fan mail to Abraham Lincoln? ;-)


That’s not the same kind of trick question, because Karl Marx did write fan mail to Lincoln.


The problem is that the model is only being asked what the answer is most likely to be, not whether a good answer exists at all.

There should be a different model that checks whether there is an answer or not, like the ones trained on SQuAD 2.0: https://rajpurkar.github.io/SQuAD-explorer/
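A rough sketch of what that could look like, assuming the Hugging Face transformers library and a community QA model fine-tuned on SQuAD 2.0 (deepset/roberta-base-squad2 here; the specific library and model are illustrative assumptions, not anything Google is known to use):

    # Sketch: an extractive QA model fine-tuned on SQuAD 2.0 can decline
    # to answer instead of forcing out the most likely span.
    from transformers import pipeline

    qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

    context = ("Apollo 11 landed on the Moon on July 20, 1969. Neil Armstrong "
               "was the first person to set foot on the lunar surface.")

    for question in ["When did Neil Armstrong set foot on the Moon?",
                     "When did Neil Armstrong set foot on Mars?"]:
        # handle_impossible_answer=True lets the pipeline return an empty
        # answer when the "no answer" score beats every candidate span.
        result = qa(question=question, context=context,
                    handle_impossible_answer=True)
        answer = result["answer"] or "(no answer found in this context)"
        print(f"{question} -> {answer} (score {result['score']:.2f})")

The point is that answerability becomes a modeled quantity rather than something bolted on afterwards: a SQuAD 2.0 model is trained on unanswerable questions, so "there is no answer" competes directly with every candidate span.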


I think the conditional display of the fact box implies they are in fact asking whether there is a good (enough) answer. They are just getting it wrong here.


But at the same time, I think it's doing a really good job of what it's trying to do. Google search is not trying to be a repository for all the world's information. It's just trying to get people to what they're most likely looking for, or to show the most related things. Given the significance of the moon landing, and the fact that no one has set foot on Mars, I find it unsurprising that it brings up info on the moon landing. It seems better to assume what the user is likely looking for, especially when (at least my) Google searches often take the form of "moon land neil year". I can just type things like that out, stream of consciousness, and the majority of the time I get what I'm looking for immediately.


There are a few issues here.

The first is that Google has specifically chosen to call out an answer of some kind. If the query is reasonably framed as a question, there is a clear indication in the UI that the response is meant to be an answer to that question.

Now it's definitely the case that a lot of questions have some amount of semantic ambiguity that a listener would have to resolve. For example, a question about a "president" can reasonably be inferred to mean specifically a "US president" of some kind, at least if the query is from the US and is in English.

And sometimes people can ask questions where there's a confused detail. And responding with the question they probably meant to ask is not unreasonable.

However--and this is a big however--it is incumbent on Google to emphasize that the answer is for a different question than the one that was literally asked. You see this when you search for misspelled terms: "did you mean this one instead?" Because occasionally, no, you did mean the term that has far fewer results.

And this kind of emphasize-the-answer presentation can have poor results sometimes. Ask Google which president became supreme dictator. The answer makes it clear why it thinks that, but... that's a really different question from the one that was asked.


If someone asked you in-person "when did Neil Armstrong set foot on Mars?", would you just say "July 20, 1969"? Or would you say "nobody has been to Mars, but if you're talking about the moon..."

Google's response here only makes sense if Google said "Did you mean: When did Neil Armstrong land on the moon?"


> isn't really any level of 'comprehension' at all.

I mean, that's a bit harsh. I bet there are lots of people who would answer the question the exact same way. They don't have a total lack of comprehension; they perhaps misheard a word or misremembered a fact they once knew. Honestly, while the Google answer is wrong, and this demonstrates a major flaw in its confidence when answering queries, the level of comprehension is still quite impressive (to me, at least).


While I agree that "no comprehension at all" seems harsh, I don't think anyone would jumble up the Moon and Mars unless they had the fact wrong in the first place.

These knowledge cards are pretty useful, but they shouldn't be taken as a source of fact (at least right now) unless one opens the link to check where that card was extracted from.


More specifically, you can get a human to answer like this, but only when the human doesn't comprehend the question properly.

I mean, you probably could get people to answer 1969 for when Neil Armstrong set foot on the poop, but only if they didn't understand the word "poop".


Yes, but a human who doesn't understand a word lacks the vocabulary for that specific language, and that's the reason they cannot understand or comprehend the question properly. Google Search, on the other hand, does have the vocabulary built in via its data set. The problem here is that it didn't piece the words together correctly even though the question is valid.

It is one thing not to understand a word in the first place, and another to be unable to piece the words together, assuming you know the language.


I agree, but it is a type of comprehension failure (Google Search doesn't understand the word "poop" either; it's just aware that it crops up in a lot of contexts and can rote-recall multiple definitions for it). The same goes for the human giving a date for Nixon landing on the moon because they understand moon landings but have no comprehension of who Nixon was (much like Google, which additionally has the unhelpful correlation between mentions of Nixon and mentions of moon landings to factor in).


But in the human's case, the lack of understanding of the word Nixon is because they didn't know what it means in the first place and don't have data to learn or cross-check what it means. Google Search, on the other hand, does have access to the data (just strings in this case, from an article) which it has to verify against the user's query and then display as the knowledge card, and yet it couldn't because... (I don't know why exactly).


I literally read the title as "moon", because that's what my brain expected the four-letter word beginning with 'm' after the words 'Neil Armstrong' to be.

I was wondering what on earth was interesting about this post!


That isn't a question of your comprehension, though. You got the input (the question) wrong because of habit, not because you couldn't comprehend the sentence structure. You misread the sentence, and that's why you were confused about what was wrong, if I understand you correctly. You most probably wouldn't have done this if you had read it as Mars in the first place.



