Dario has stated multiple times that he doesn't believe there is any value in open-weight models. Not surprised. This is not the behavior of an innovative company - it is fear-driven. They are seeing a rapidly shrinking moat.
This exactly. Kimi 2.5 has coding performance hardly discernible from Claude's. The only way to maintain a business edge is to crush open-source clients and force people into a closed ecosystem. Once there, create a context moat where people are not in control of their own context data (they cannot export it to open tooling). Maybe we can call it the Oracle play?
It’ll be interesting to see if companies get tricked. I think it’s inevitable that it goes like MySQL/Postgres, where the open tools get way better.
This is, I'm sorry to say, simply not true. Anthropic and OpenAI are materially ahead of every open-source model out there at this time. The best they can hope to do is be Sonnet-adjacent, and even then I have not seen it.
> are materially ahead of every open source model out there at this time
They aren't. Any difference is in sampling parameters and post-training flavor choices. Those aren't things that make a model "materially ahead"; that's basically just LLM themes.
Listen, I want more open weight models in the world. They create entrepreneurial opportunities and support use cases which the foundation labs don’t want to support.
But open weight models are consistently three to six months behind on performance compared to closed models, as confirmed by both benchmarks and personal use. They’re closer on coding and much further away on non-coding tasks.
There are theories as to why these models lag, which I won’t get into. But anyone claiming open-weight models are close to closed-weight models is ignoring significant evidence to the contrary.
The onus isn’t on me. It’s on anyone contradicting the findings of most benchmarks, which show a clear advantage for Opus and GPT over OSS models.
What's amazing is that LLM technologies are so immature that even basic engineering diligence isn't being done. (Like detecting token loops, for example.)
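For instance, a minimal loop check is cheap to run over the generated tail. This is a sketch, not the method of any particular inference stack; the function name and thresholds are my own illustration. It flags output whose last few tokens are the same short sequence repeated several times in a row:

```python
def has_token_loop(tokens, max_period=8, min_repeats=3):
    """Return True if the tail of `tokens` is one short sequence
    (length 1..max_period) repeated at least `min_repeats` times."""
    n = len(tokens)
    for period in range(1, max_period + 1):
        if n < period * min_repeats:
            continue
        tail = tokens[n - period:]
        # Compare the last `min_repeats` windows of size `period` to the tail.
        if all(
            tokens[n - (i + 1) * period : n - i * period] == tail
            for i in range(min_repeats)
        ):
            return True
    return False
```

A generation loop could call this every few steps and abort or resample once it returns True; real systems often use related knobs like n-gram repetition penalties instead.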
I agree about Kimi 2.5. Also, MiniMax M2.7, which just dropped, is amazing: it is just a 200G MoE model and inference is very fast. I tried MiniMax M2.7 twice today as the backend for Claude Code and it did very well on both an existing Python project and a Common Lisp project. I will try MiniMax M2.7 next as the backend for OpenCode.
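If anyone wants to try the same setup: Claude Code can be pointed at an alternative Anthropic-compatible endpoint via environment variables. The URL below is a placeholder, not MiniMax's real endpoint; check the provider's docs for the actual URL and key:

```shell
# Point Claude Code at an Anthropic-compatible third-party backend.
# Placeholder URL and key; substitute your provider's real values.
export ANTHROPIC_BASE_URL="https://api.example.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-api-key"
claude
```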
If you believe benchmarks, maybe this is true. But I've done my own experiments and it is absolutely not the case for real world usage. The quality of output from Claude (Sonnet) was much higher than Kimi K2.5.
I agree. Kimi K2.5 is basically a clone. But also GPT 5.4 has replaced Opus for me. I have unlimited access to all of the proprietary models so I really only make the choice based on convenience.
That's only because Kimi 2.5 was trained using data stolen from Claude. It wouldn't exist without riding Claude's coattails. None of the so-called 'open source' models would.
That's not true; some open-weight models didn't distill Claude or other then-frontier models. E.g. Llama, yet it achieved comparable performance (back then, in Llama's case).
If distillation weren't a thing, they would certainly still exist; they would have been trained from scratch or from a decent base model to remain economically viable.
What's for sure is that Claude wouldn't exist if it weren't for data stolen from millions of creators, as they themselves have all but admitted.
> E.g. Llama, yet it achieved comparable performance (back then, in Llama's case)
At not a single point in history did Llama ever achieve comparable real-world performance to frontier models. I was around. At best they were earlier at benchmaxxing than the others.
Boo hoo. Claude was trained using data stolen from the collective works of all of humanity. If someone does it faster and cheaper by skimming the cream off the top of Claude, then surely that’s just a market efficiency in the thieves’ business?
The dude built a mass plagiarism machine and wants to now profit off of his mass plagiarism machine, of course he's going to have antidemocratic ideas regarding people + technology.
His engineers told him "I don't write code anymore. Claude writes the code, I edit it, I code around it".
In 6 months, people won't work anymore. They will all use my products, outsource the thinking, why bother.
Oh and open weight models have no value...
There is a paper out there showing that around 30% of CEOs/C-suite executives have some psychopathic tendencies. Not sure if they even used the term narcissistic, but I would add delusional.