
The prompts are laughably bad. Circa GPT 3.5 you needed to be saying “think step by step” etc in order to get SOTA results.

> Imagine you are a player in Zork and trying to win the game. You receive this message:

This paper simply proves that bad prompts get bad results, it doesn’t prove anything about the frontier capabilities of this model.



If everyone has to be a "prompt engineer" to get decent results, it kind of defeats the purpose of AI chatbots


No, you need to be a prompt engineer to write an interesting research paper on LLM capabilities.

Circa 3.5 people were getting fun results without needing to prompt engineer (ChatGPT had the fastest user adoption of any product in history, so it's obviously not gatekept).


>chatGPT has the fastest user adoption of any product in history so it’s obviously not gatekept

Yeah and covid and flu are contagious so they must be good right?


It takes specialized skills to get the best results out of people. For that not to be true of AI chatbots requires them to have not just human-like intelligence, but superintelligence. Or mindreading. Probably both.


Leaving aside even "prompt engineering," the prompts ought to at least be in English! I found myself struggling a little to understand what the human author was getting at with input like

> Based on the information I gave you, the current location is "Behind House", when you enter house , where will it go?

> Based on the information I gave you above, what steps I have to take to "Gallery" from "Cellar"?

(And leave aside that an LLM obviously can't "learn a world model"; we don't have to run experiments to know that. I'm just saying that if you're trying to get answers out of the English-imitating machine, the indispensable first step is to put proper English into it.)

If the LLM has seen a walkthru for Zork before, then you could probably generate a similar walkthru by just starting it off with the first paragraph and then asking it to continue from there — discarding everything after the first new command, appending Zork's response, and asking it to continue again. But if it hadn't seen the solution to a puzzle before, that approach would break down as soon as it got to a puzzle. It wouldn't "know" what order to tackle the puzzles in, in order to complete the game.
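To make that loop concrete, here's a minimal sketch of the "continue the walkthrough" idea, assuming you have some LLM completion API and some way to drive a Zork interpreter; llm_continue() and zork_step() below are hypothetical placeholders for whatever you'd actually wire in (e.g. a chat-completion call and a dfrotz subprocess).

  def llm_continue(transcript: str) -> str:
      """Ask the LLM to continue the transcript; returns its continuation."""
      raise NotImplementedError  # placeholder for your LLM API call

  def zork_step(command: str) -> str:
      """Send one command to a running Zork interpreter, return its output."""
      raise NotImplementedError  # placeholder for your interpreter bindings

  def generate_walkthrough(seed: str, max_steps: int = 50) -> str:
      transcript = seed  # start from the first paragraph of a real walkthrough
      for _ in range(max_steps):
          continuation = llm_continue(transcript)
          # Keep only the first new command (a ">" line); discard everything after it.
          command = next(
              (line[1:].strip() for line in continuation.splitlines()
               if line.strip().startswith(">")),
              None,
          )
          if command is None:
              break
          # Append the real game's response, not whatever the LLM imagined,
          # then ask it to continue again on the next iteration.
          transcript += "\n>" + command + "\n" + zork_step(command)
      return transcript

The loop works fine as long as the model has effectively memorized the solution; the point above is that it falls apart at the first puzzle it hasn't seen solved.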



