
I always think of LLMs as offshore teams with a strong cultural aversion to saying "no".

They will do ANYTHING but tell the client they don't know what to do.

Mocking the tests so heavily that they only test the mocks? Yep!

Rewriting the whole thing to do something different, as long as it compiles? Great!

Stopping and actually saying "I can't solve this, please give more instructions"? NEVER!



This is exactly how dumb these SOTA models feel. A real AI would stop and tell me it doesn't know for sure how to continue and that it needs more information from me, instead of guessing wildly. Sonnet, Opus, Gemini, Codex: they all share this fundamental flaw of being unable to stop when they are uncertain, and so they produce shit solutions to problems I never had but now have.


This is a feature, not a bug. In chatbot mode and in coding, the vast majority of consumers do not have the critical thinking skills necessary to realise the models are making stuff up, so the AI companies are incentivized to train accordingly. When the same models are used in agent mode, the problem is just far more glaring: they don't respect (or fear) the terminal as much as they should, they try to give the user some positive output, and here we are.


I don't see a reason to believe that this is a "fundamental error". I think it's just an artifact of the way they are trained, and if the training penalized them more for taking a bad path than for stopping for instructions, then the situation would be different.


It seems fundamental, because it’s isomorphic to the hallucination problem, which is nowhere near solved. Basically, LLMs have no meta-cognition, no confidence in their output, and no sense that they’re on “thin ice”. There’s no difference between hard facts, fiction, educated guesses and hallucinations.

Humans who are good at reasoning tend to “feel” the number of shaky assumptions they’ve made, and after some steps continuing becomes ridiculous because the certainty converges towards 0.

You could train them to stop early but that’s not the desired outcome. You want to stop only after making too many guesses, which is only possible if you know when you’re guessing.
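To put rough numbers on the compounding-guesses point above, here is a minimal sketch. It assumes each step in a chain of reasoning has its own probability of being right and that the steps are independent, which is a simplification; the function names are made up for illustration and say nothing about how any actual model works.

```python
def chain_confidence(step_confidences):
    """Probability that every step holds, assuming the steps are independent."""
    total = 1.0
    for p in step_confidences:
        total *= p
    return total


def steps_until_below(per_step, threshold):
    """Count equally uncertain steps until overall confidence drops below threshold."""
    if not (0.0 <= per_step < 1.0) or threshold <= 0.0:
        raise ValueError("need 0 <= per_step < 1 and threshold > 0")
    steps = 0
    confidence = 1.0
    while confidence >= threshold:
        confidence *= per_step  # one more guess stacked on top of the others
        steps += 1
    return steps


if __name__ == "__main__":
    # Seven guesses at 90% each already leave you worse than a coin flip.
    print(steps_until_below(0.9, 0.5))
    print(chain_confidence([0.9] * 7))
```

The stopping rule only works if the per-step numbers exist in the first place, which is exactly the part that is missing.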


Fine. I'll cancel all my other AI subscriptions as soon as an AI stops aiming to please me and behaves like a real professional. That means your AI doesn't assume my personality is Trump-like and in need of constant flattery. It means you respect your users enough not to outsource RLHF to the lowest bidder, but pay actual senior (!) professionals in the respective fields you're training the model for. No provider does this; they all went down the path of pleasing some kind of low-IQ population. Yes, I'm looking at you, sama and fellows.


I think that it will take more time, but things do seem to be going in this direction. See this on the front page at the moment - https://news.ycombinator.com/item?id=44622637


These things are intelligent in the same way Aloy of Horizon fame is brave.


Well, companies seem to absolutely love offshoring at the moment, so these kinds of LLMs are probably an absolute dream to them.

(And imagine a CTO getting a demo of ChatGPT etc. and being told "no, you're wrong". The C-suite doesn't usually like hearing that! They love sycophants.)


Except offshore teams do "tell" you they can’t do what you want; they just do it using cultural cues you don’t pick up. LLMs on the other hand…


I think we just haven't figured out that "let's try a different approach" is actually a desperate plea for help.



