
It generalizes as well as GPT-3 does:

---

If you took the word "Wikipedia", and threw all the letters up in the air, what are some possible things they'd spell once they fell on the ground again?

If you took the word "Wikipedia" and threw all the letters up in the air, some possible things they could spell once they fell on the ground again might include "kawipieda," "keiapwidi," "kipedwaia," "pwakieida," or "eipawikid." However, without being able to see the exact arrangement of the letters on the ground, it's impossible to say for sure what they would spell.

---

(last sentence is a bit confused)




We don't know how well it is generalising to your input without knowing the training data. Failure cases, however, tell us a lot, such as the following:

Input: "If five cats can catch five mice in five minutes, how long will it take one cat to catch one mouse?"

Output: "If five cats can catch five mice in five minutes, then it will take one cat one minute to catch one mouse. This is because the number of cats and mice are inversely proportional, so as the number of cats decreases, the time it takes for a cat to catch a mouse will increase. Similarly, as the number of mice decreases, the time it takes for a cat to catch a mouse will decrease. Therefore, if five cats can catch five mice in five minutes, then it will take one cat one minute to catch one mouse."

As you can see, it fails at simple logic because it fails to understand that the number of mice isn't fixed in the question: five cats catch five mice in five minutes precisely because each cat catches one mouse in those same five minutes, so the correct answer is five minutes, not one. Whereas a human would immediately pick that up because they reason beyond x=5 and y=5.
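
For what it's worth, here is a minimal sketch of the rate arithmetic, assuming the standard reading that the cats hunt in parallel:

    # Five cats catch five mice in five minutes.
    cats, mice, minutes = 5, 5, 5

    # Per-cat rate, in mice per cat per minute (parallel work assumed):
    rate_per_cat = mice / (cats * minutes)  # 0.2

    # Time for one cat to catch one mouse at that rate:
    print(1 / rate_per_cat)  # 5.0 minutes, not 1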


Are you sure a human would immediately catch this? The question is somewhat ambiguous, and I bet if you posed it to many people, they would take the oversimplified, non-gotcha approach and simply say one minute for one mouse, just like the AI. Of course, if you abstract out, there are many other variables at play, but within the confines of a simple word problem the answer is not necessarily incorrect.

You could probably test this by asking a few friends the question and seeing what they say. Outside of pure math problems, you can get into an infinite regress defining the underlying first principles behind any given assumption.


> We don't know how well it is generalising to your input without knowing the training data

Are you claiming its training data has letter permutations of the word “Wikipedia”?

It’s actually pretty capable of doing basic combinatorics.
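
For reference, a quick sketch of how many distinct arrangements there are ("Wikipedia" has nine letters, with "i" repeated three times; nothing here is specific to any model):

    from collections import Counter
    from math import factorial, prod

    word = "wikipedia"
    counts = Counter(word)

    # Multinomial coefficient: n! divided by the factorial of each
    # letter's multiplicity (only "i" repeats here, three times).
    arrangements = factorial(len(word)) // prod(factorial(c) for c in counts.values())
    print(arrangements)  # 60480 distinct orderings

That's far too many for the training data to plausibly enumerate.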


I am not claiming anything other than that we do not know the training data; therefore, not much can be inferred from some success case about how well it generalises.


> Whereas a human would immediately pick that up because they reason beyond x=5 and y=5.

[Citation needed]

I think that the computer made the absolutely standard human mistake, so that could be considered a plus.


Quite interesting that it will make subtle errors in its otherwise reasonable-looking answer: e.g. "kipedwaia" has two "a"s, and "kawipieda", "kipedwaia" and "pwakieida" each have only two "i"s, whereas "Wikipedia" has one "a" and three "i"s.
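
The mismatches are easy to check mechanically; here's a minimal sketch using the candidate strings from the answer quoted above:

    from collections import Counter

    target = Counter("wikipedia")  # three "i"s, one each of w, k, p, e, d, a
    candidates = ["kawipieda", "keiapwidi", "kipedwaia", "pwakieida", "eipawikid"]

    for word in candidates:
        ok = Counter(word) == target
        print(word, "valid anagram" if ok else "letter counts differ")

Only "keiapwidi" and "eipawikid" come back with the right letter multiset.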

I have seen reports that it will happily hallucinate a plausible but wrong answer to all sorts of different prompts, intermixed with many mostly correct answers. It's interesting to think about how to place trust in such a system.


Peter Watts has a series called Rifters that explores this a little. "Smart gels", neural nets made up of a mishmash of cultured neurons and silicon, run most of society. They're trained just like neural nets today, and therefore their decision-making process is basically a black box. No one is really sure how they arrive at their answers, but they work great and they're so much cheaper, so who cares.

Anyhow, spoiler alert: the neural nets running the virus response have been inadvertently trained to prefer simple systems over complex ones, without anyone realizing it, and they decide that a planet wiped clean of life by the virus is infinitely simpler than the present one, so they start helping the virus instead of stopping it.

So the short answer to your question is that I would not place much, if any, trust in systems like that, at least for anything with high-stakes, real-world consequences.



