

code51 on Jan 25, 2024 | on: Self-rewarding-lm-PyTorch: Self-Rewarding Language...

Focusing solely on AlpacaEval feels a bit limiting for validating the gains.

What's the evidence here that this is not just a kind of leaderboard hacking for LLMs?
