
Looks like a cool paper. It's really puzzling to me why Llama turned out to be so bad while they're releasing great research. Especially considering the number of GPUs Meta has, Llama really seems inexcusable compared to models from much smaller teams with far fewer resources.


Llama will advance further, just like the rest. The LLM leaderboards are constantly changing. They will all reach a maturity point and be roughly the same; we'll probably see that within the next 1-3 years tops. Then it'll just be incremental drops in the cost to train and run them, while the quality will be comparable. Not to mention we're already running out of training data.


It was always about more than GPUs: even when the original Llama came out, the community released fine-tunes that would bench higher than the base model. And with the DeepSeek distilled models, it turned out you could fine-tune some reasoning into a base model and make it perform better.
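For what it's worth, the DeepSeek-style trick is essentially supervised distillation: sample reasoning traces from a strong teacher, then fine-tune the base model on them. A minimal sketch of that loop, assuming HuggingFace transformers/datasets and a hypothetical traces.jsonl of teacher prompt/response pairs (the student model name is just a placeholder, not what DeepSeek actually used):

    # Minimal distillation-style SFT sketch. Assumption (not from this
    # thread): `traces.jsonl` holds {"prompt": ..., "response": ...} records
    # where `response` is a teacher model's chain-of-thought plus answer.
    from datasets import load_dataset
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    BASE = "meta-llama/Llama-3.2-1B"  # placeholder student model

    tok = AutoTokenizer.from_pretrained(BASE)
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(BASE)

    def tokenize(ex):
        # Plain SFT: loss over the full sequence, so the student learns to
        # reproduce the teacher's reasoning, not just its final answers.
        text = ex["prompt"] + "\n" + ex["response"] + tok.eos_token
        return tok(text, truncation=True, max_length=2048)

    ds = load_dataset("json", data_files="traces.jsonl", split="train")
    ds = ds.map(tokenize, remove_columns=ds.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="distilled-student",
            per_device_train_batch_size=1,
            gradient_accumulation_steps=8,
            num_train_epochs=1,
            learning_rate=2e-5,
            bf16=True,
        ),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()

The surprising empirical bit, per the DeepSeek-R1 report, is that plain next-token SFT on traces like this was enough to move reasoning benchmarks on small base models; the distilled variants didn't need RL.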


Meta seemingly has multiple AI divisions, and I feel like the cool research doesn't usually come from their GenAI department.


Llama 4 felt more like a Mistral release: SotA-ish, but using fewer resources.



