Are you suggesting that it was commonplace for machine learning models to be able to extrapolate medical case reports into an actual diagnosis? Even being able to read and understand a case report is a minor miracle in extrapolation, and it’s interesting how far the goal posts move.
I have another example of ChatGPT not generalizing but just being a really good statistical model. I needed a solution to a problem that you can't find on Google, and a variation of a problem that also can't be found on Google. I attempted to obtain around 15 lines of code from ChatGPT that would solve the problem, but it consistently failed to produce the correct solution. I spent a few hours trying different prompts, indicating its errors and receiving apologies, only for it to generate another incorrect solution while acknowledging its mistake.
Solving out of distribution problems correctly seems almost impossible for it.
GPT-3.5 or 4? Surprisingly, it makes a huge difference. I think a lot of people's impressions come from 3.5; many startups couldn't have been built on it, whereas with 4 they can.
If it was 4 I’d be curious about the specific problem if you’d be willing to link to the chat.
I had a similar issue with GPT-4. I was looking for a library that, given a grammar and a string, would produce the list of next valid symbols.
GPT-4 only ever suggested grammar validators, to the point that I had given up and was going to write a grammar generator myself. So I started looking for the Python equivalent of ANTLR, and within three searches I found nltk.grammar, which actually solves the original problem.
It's not a new library either, so I'm dumbfounded as to why GPT-4 couldn't make sense of my request.
I tried this to see. It does suggest using validators, though in a way that mostly solves the problem. The failure mode is distinguishing whether the string is valid as a complete sentence or as a prefix that can still be extended. It does have a bit of a workaround for that.
No, because if I add a space to the original string I don't get a valid string.
Or, given the context, a space is a valid production after the last error, but then I would need to roll back the incomplete "do" terminal and lose that context.
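For what it's worth, the behavior asked for in this thread, given a grammar and a token prefix, list the terminals that can come next, can be sketched without any library via an Earley recognizer: a prefix is viable exactly when every token can be scanned, and the next valid terminals are the symbols sitting after the dot in the final state set. This is a minimal sketch, not nltk.grammar's API; the toy "do ... end" grammar and all names here are hypothetical, chosen to echo the incomplete "do" terminal mentioned above.

```python
from collections import namedtuple

# Hypothetical toy grammar: nonterminals are dict keys; any other symbol
# appearing on a right-hand side is a terminal.
GRAMMAR = {
    "S": [("do", "STMT", "end")],
    "STMT": [("print", "NUM"), ("STMT", ";", "STMT")],
    "NUM": [("1",), ("2",)],
}
START = "S"

# An Earley item: production lhs -> rhs, dot position, and origin state index.
Item = namedtuple("Item", "lhs rhs dot origin")

def earley(tokens, grammar, start):
    """Run the Earley algorithm; return the list of state sets, or None
    as soon as some token cannot be scanned (prefix not viable)."""
    states = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        states[0].add(Item(start, rhs, 0, 0))
    for i in range(len(tokens) + 1):
        changed = True
        while changed:  # fixpoint over predict/scan/complete for state i
            changed = False
            for item in list(states[i]):
                if item.dot < len(item.rhs):
                    sym = item.rhs[item.dot]
                    if sym in grammar:  # predict a nonterminal
                        for rhs in grammar[sym]:
                            new = Item(sym, rhs, 0, i)
                            if new not in states[i]:
                                states[i].add(new); changed = True
                    elif i < len(tokens) and tokens[i] == sym:  # scan a terminal
                        new = Item(item.lhs, item.rhs, item.dot + 1, item.origin)
                        if new not in states[i + 1]:
                            states[i + 1].add(new); changed = True
                else:  # complete: advance parents waiting on this nonterminal
                    for parent in list(states[item.origin]):
                        if (parent.dot < len(parent.rhs)
                                and parent.rhs[parent.dot] == item.lhs):
                            new = Item(parent.lhs, parent.rhs,
                                       parent.dot + 1, parent.origin)
                            if new not in states[i]:
                                states[i].add(new); changed = True
        if i < len(tokens) and not states[i + 1]:
            return None  # token i could not be scanned anywhere
    return states

def next_terminals(tokens, grammar, start):
    """Terminals that can validly follow the given prefix (empty if invalid)."""
    states = earley(tokens, grammar, start)
    if states is None:
        return set()
    return {item.rhs[item.dot] for item in states[len(tokens)]
            if item.dot < len(item.rhs) and item.rhs[item.dot] not in grammar}
```

For example, `next_terminals(["do"], GRAMMAR, START)` yields `{"print"}`, and after a complete statement like `["do", "print", "1"]` both `";"` and `"end"` are offered, which is exactly the complete-versus-extendable distinction discussed above, without having to roll back any partially consumed terminal.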