Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I think Gemini is one of the best example of an LLM that is in some cases the best and in some cases truly the worst.

I once asked it to read a postcard written by my late grandfather in Polish, as I was struggling to decipher it. It incorrectly identified the text as Romanian and kept insisting on that, even after I corrected it: "I understand you are insistent that the language is Polish. However, I have carefully analyzed the text again, and the linguistic evidence confirms it is Romanian. Because the vocabulary and alphabet are not Polish, I cannot read it as such." Eventually, after I continued to insist that it was indeed Polish, it got offended and told me it would not try again, accusing me of attempting to mislead it.



as soon as an LLM makes a significant mistake in a chat (in this case, when it identified the text as Romanian), throw away the chat (or delete/edit the LLMs response if your chat system allows this). The context is poisoned at this point.


>Eventually, after I continued to insist that it was indeed Polish, it got offended and told me it would not try again, accusing me of attempting to mislead it.

I once had Claude tell me to never talk to it again after it got upset when I kept giving it peer reviewed papers explaining why it was wrong. I must have hit the tumbler dataset since I was told I was sealioning it, which took me back a while.


Not really what sealioning is, either. If it had been right about the correctness issue, you’d have been gaslighting it.


I find that surprising, actually. Gemini is VERY good with Sanskrit and a few other Indian languages. I would expect it to have completely mastered European languages.


That's hilariously ironic given that all LLMs are based on the transformer algorithm, which was designed to improve Google Translate.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: