I have to assume that targeted/curated LLM training sets will tend to be less accurate than very general ones, just by the very nature of how they work.

(edited for clarity)




I know it's not quite analogous, but I fine-tuned GPT-3 on a small data set (200 examples) and it performed extremely poorly compared to the base, un-fine-tuned model.

This surprised me: I thought it might not do much better, but I wasn't expecting that specializing it on my target data would actually reduce performance! I had fewer examples than the minimum OpenAI recommends, so maybe it was a case of overfitting or something like that. The rough setup is sketched below.
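For context, the setup was roughly this. A minimal sketch using the legacy openai Python SDK (pre-1.0), which is what GPT-3 fine-tuning went through at the time; the file name and base model choice here are placeholders, not my exact values:

    import openai

    openai.api_key = "sk-..."  # your API key

    # Training data is JSONL, one prompt/completion pair per line, e.g.:
    # {"prompt": "Question: ...\n\n###\n\n", "completion": " answer"}
    upload = openai.File.create(
        file=open("my_data.jsonl", "rb"),  # ~200 examples in my case
        purpose="fine-tune",
    )

    # Kick off the fine-tune; "curie" was one of the GPT-3 base models
    job = openai.FineTune.create(training_file=upload.id, model="curie")
    print(job.id)  # poll this job until the tuned model is ready

With only a couple hundred examples the tuned model can easily memorize the training set instead of generalizing, which would be consistent with the drop in quality I saw.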



