Dario has stated multiple times that he doesn't believe there is any value in open-weight models. Not surprised. This is not the behavior of an innovative company - it is fear-driven. They are seeing a rapidly shrinking moat.
This exactly. Kimi 2.5 has coding performance hardly discernible from Claude's. The only way to maintain a business edge is to crush open-source clients and force people into a closed ecosystem. Once there, create a context moat where people are not in control of their own context data (they cannot export it to open tooling). Maybe we can call it the Oracle play?
It’ll be interesting to see if companies get tricked. I think it’s inevitable that it goes like MySQL/Postgres, where the open tools get way better.
This is, I'm sorry to say, simply not true. Anthropic and OpenAI are materially ahead of every open-source model out there at this time. The best they can hope to do is be Sonnet-adjacent, and even then I have not seen it.
> are materially ahead of every open source model out there at this time
They aren't. Any difference is in sampling parameters and post-training flavor choices. Those aren't things that make a model "materially ahead"; that's basically just LLM themes.
Listen, I want more open weight models in the world. They create entrepreneurial opportunities and support use cases which the foundation labs don’t want to support.
But open weight models are consistently three to six months behind on performance compared to closed models, as confirmed by both benchmarks and personal use. They’re closer on coding and much further away on non-coding tasks.
There are theories as to why these models lag, which I won’t get into. But anyone claiming open-weight models are close to closed-weight models is ignoring significant evidence to the contrary.
The onus isn’t on me. It’s on anyone contradicting the findings of most benchmarks, which show a clear advantage for Opus and GPT over OSS models.
What's amazing is that LLM technologies are so immature that even basic engineering diligence isn't being done. (Like detecting token loops, for example.)
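For instance, a minimal loop check is cheap to run over the generated tail. This is a sketch, not the method of any particular inference stack; the function name and thresholds are my own illustration. It flags output whose last few tokens are the same short sequence repeated several times in a row:

```python
def has_token_loop(tokens, max_period=8, min_repeats=3):
    """Return True if the tail of `tokens` is one short sequence
    (length 1..max_period) repeated at least `min_repeats` times."""
    n = len(tokens)
    for period in range(1, max_period + 1):
        if n < period * min_repeats:
            continue
        tail = tokens[n - period:]
        # Compare the last `min_repeats` windows of size `period` to the tail.
        if all(
            tokens[n - (i + 1) * period : n - i * period] == tail
            for i in range(min_repeats)
        ):
            return True
    return False
```

A generation loop could call this every few steps and abort or resample once it returns True; real systems often use related knobs like n-gram repetition penalties instead.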
I agree about Kimi 2.5. Also, MiniMax M2.7, which just dropped, is amazing: it is just a 200G MoE model and inference is very fast. I tried MiniMax M2.7 twice today as the backend for Claude Code and it did very well on both an existing Python project and a Common Lisp project. I will try MiniMax M2.7 next as the backend for OpenCode.
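If anyone wants to try the same setup: Claude Code can be pointed at an alternative Anthropic-compatible endpoint via environment variables. The URL below is a placeholder, not MiniMax's real endpoint; check the provider's docs for the actual URL and key:

```shell
# Point Claude Code at an Anthropic-compatible third-party backend.
# Placeholder URL and key; substitute your provider's real values.
export ANTHROPIC_BASE_URL="https://api.example.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-api-key"
claude
```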
If you believe benchmarks, maybe this is true. But I've done my own experiments and it is absolutely not the case for real world usage. The quality of output from Claude (Sonnet) was much higher than Kimi K2.5.
I agree. Kimi K2.5 is basically a clone. But also GPT 5.4 has replaced Opus for me. I have unlimited access to all of the proprietary models so I really only make the choice based on convenience.
That's only because Kimi 2.5 was trained using data stolen from Claude. It wouldn't exist without riding Claude's coattails. None of the so-called 'open source' models would.
That's not true; some open-weight models didn't distill Claude or other then-frontier models. E.g. Llama, yet it achieved comparable performance (back then, in Llama's case).
If distillation weren't a thing, they would certainly still exist; they would have been trained from scratch or from a decent base model to remain economically viable.
What's for sure is that Claude wouldn't exist if it weren't for data stolen from millions of creators, as they themselves have all but admitted.
> E.g. Llama, yet it achieved comparable performance (back then, in Llama's case)
At not a single point in history did Llama ever achieve comparable real-world performance to frontier models. I was around. At best they were earlier at benchmaxxing than the others.
Boo hoo. Claude was trained using data stolen from the collective works of all of humanity. If someone does it faster and cheaper by skimming the cream off the top of Claude, then surely that’s just a market efficiency in the thieves’ business?
The dude built a mass plagiarism machine and wants to now profit off of his mass plagiarism machine, of course he's going to have antidemocratic ideas regarding people + technology.
His engineers told him "I don't write code anymore. Claude writes the code, I edit it, I code around it".
In 6 months, people won't work anymore. They will all use my products, outsource the thinking, why bother.
Oh and open weight models have no value...
There is a paper out there showing that around 30% of CEOs/C-suite executives have some psychopathic tendencies. Not sure if they even used the term narcissistic, but I would add delusional.