
You might say that, but the literal benchmark of LLMs (or any supervised learning algorithm, for that matter) is loss: how much 'distance' there is between the model's output and the validation set, after it has been fit to the training set. With a loss of well below 1%, which is typical, it can pretty much recreate the training data.
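For concreteness, here's a minimal sketch of what that loss means for a single next-token prediction (the vocabulary and probabilities below are made up for illustration):

    import math

    # Toy next-token prediction. A language model outputs a probability
    # for each vocabulary token; cross-entropy loss is -log of the
    # probability it assigned to the actual next token, i.e. the
    # "distance" between its output and the target.
    # (Vocabulary and probabilities are invented for this example.)
    vocab = ["the", "cat", "sat", "mat"]
    predicted = {"the": 0.05, "cat": 0.05, "sat": 0.85, "mat": 0.05}
    true_next = "sat"

    loss = -math.log(predicted[true_next])
    print(f"loss = {loss:.4f}")  # ~0.16 nats; a perfect prediction gives 0

Averaged over all tokens in a dataset, that's the number a training curve reports; the closer it sits to zero, the more probability mass the model puts on reproducing the exact target tokens.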



I've not seen that number before. What paper reports a loss of less than 1% on the validation/test set?



