> 2. Reviews are not just about the code's potential mechanics, but inferring and comparing the intent and approach of the writer. For LLMs, that ranges between non-existent and schizoid, and writing it yourself skips that cost.
With humans you can be reasonably sure they've followed through with a mostly consistent level of care and thought. LLMs will just outright lie to make their jobs easier in one section while generating high quality code in another.
I've had to do a 'git reset --hard' after trying out Claude Code and spending $20. It always seems great at first, but it just becomes nonsense on larger changes. Maybe chain-of-thought models do better though.
It's like cutting and pasting from Stack Overflow, if SO didn't have a voting system to give you some hope that the top answer at least works and wasn't hallucinated by someone who didn't understand the question.
I asked Gemini for the lyrics of a song that I knew was on all the main lyrics sites. It gave me the lyrics to a different song with the same title. On the second try, it hallucinated a batch of lyrics. Third time, I gave it a link to the correct lyrics, and it "lied" and said it had consulted that page to get it right but gave me another wrong set.
It did manage to find me a decent recipe for chicken salad, but I certainly didn't make it without checking to make sure the ingredients and ratios looked reasonable. I wouldn't use code from one of these things without closely inspecting every line, which makes it a pointless exercise.
I'm pretty sure Gemini (and likely other models too) has been deliberately engineered to avoid outputting exact lyrics, because the LLM labs know that the music industry is extremely litigious.
I'm surprised it didn't outright reject your request to be honest.
I wondered if it'd been banned from looking at those sites. If that's commonly known (I've only started dabbling in this stuff, so I wasn't aware of that), it's interesting that it didn't just tell me it couldn't do that, instead of lying and giving false info.
I did the exact same today!
It started out reasonable, but as you iterate on the commits/PR it becomes complete crap. And expensive too, for very little value.