I think you misunderstood my question, though maybe I phrased it poorly. The problem is not whether we should trust any deep learning model (the answer is indeed no). The question is how we can find out whether a model is any good before investing our time in it. Every bad reply has a cost, because it wastes our time. So how can we compare models objectively without first having to try them out ourselves?



There are leaderboards [1] that can provide a rough estimate of the relative capabilities of different models.

[1] https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...
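
For a rough idea of how a leaderboard like this can turn head-to-head votes into a ranking, here is a minimal sketch of an Elo-style rating update. It is only an illustration under my own assumptions, not the leaderboard's actual code: the model names, the votes, the starting rating of 1000 and the K-factor of 32 are all made up.

```python
from collections import defaultdict


def expected_score(r_a, r_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_ratings(battles, k=32.0, start=1000.0):
    """Replay (model_a, model_b, winner) votes and return final ratings.

    `winner` is the name of the winning model, or None for a tie.
    `k` and `start` are illustrative defaults, not values from any real leaderboard.
    """
    ratings = defaultdict(lambda: start)
    for a, b, winner in battles:
        e_a = expected_score(ratings[a], ratings[b])
        # Actual score for A: 1 for a win, 0 for a loss, 0.5 for a tie.
        s_a = 1.0 if winner == a else 0.0 if winner == b else 0.5
        delta = k * (s_a - e_a)
        # The update is zero-sum: whatever A gains, B loses.
        ratings[a] += delta
        ratings[b] -= delta
    return dict(ratings)


# Hypothetical votes between two made-up models.
battles = [
    ("model-x", "model-y", "model-x"),
    ("model-x", "model-y", "model-x"),
    ("model-x", "model-y", "model-y"),
    ("model-x", "model-y", None),
]

ratings = elo_ratings(battles)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

The point is that nobody has to try every model themselves: many users each cast a few votes, and the aggregate gives a relative ordering. It is still only a rough estimate, since the result depends on who votes and on which prompts.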



