I don't think it's necessarily about DeepSeek, but about the wider competitive picture. There are two tacit assumptions being made about LLMs - that having a SOTA model is a substantial competitive advantage, and that the demand for compute will continue to grow rapidly.
DeepSeek's phenomenal success in reducing training and inference cost points to the possibility of a very different future. If it's the case that SOTA or near-SOTA performance is commoditised and progress in efficiency outpaces progress in capability, then the roadmap looks radically different. If DeepSeek don't have a competitive advantage, then no-one has a competitive advantage. Having a DC full of H200s or a proprietary model with a trillion parameters might not count for anything, in which case we're looking at a very different set of winners and losers. Application-specific fine-tuning and product-market fit might matter much more than brute-force compute.
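To make that last point concrete, here's a minimal sketch of what application-specific fine-tuning looks like in practice, assuming the Hugging Face transformers + peft stack; the checkpoint name is just an illustrative open model, not a recommendation:

```python
# Sketch only: LoRA fine-tuning setup on an open checkpoint.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder open model; any similar causal LM would do.
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-llm-7b-base")

# LoRA trains small low-rank adapter matrices instead of all weights,
# so task-specific tuning costs a tiny fraction of pretraining.
config = LoraConfig(
    r=16,                                # adapter rank
    lora_alpha=32,                       # adapter scaling factor
    target_modules=["q_proj", "v_proj"], # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```

The asymmetry is the whole point: the base model is a commodity you download, and the part that differentiates you fits on a laptop.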
Isn't this the nature of past technology developments? Few tech companies have a true technical "moat". In California, the employees of any firm are free to raise funds and start a competitor the moment they're dissatisfied with the current leadership/compensation/location. During my career I have yet to observe a "secret sauce" that took more than a few weeks to learn and understand once on the inside.
The technical moats we know of in B2B have typically come from a combination of a large number of features efficiently tied into a platform/service that would be cost-prohibitive to replicate (ElasticSearch, most successful database firms), and a network effect around that platform that makes it difficult not to be on it (CUDA, x86, Windows).
>> 3. Most importantly, DeepSeek is open source, which means that the other models are free to copy whatever secret sauce it has, e.g. whatever architecture purportedly uses less compute can easily be copied.
> I don't think it's necessarily about DeepSeek, but about the wider competitive picture. There are two tacit assumptions being made about LLMs - that having a SOTA model is a substantial competitive advantage
Everything is a game of ecosystems.
Windows lost to Linux on servers because Linux was cheap and easy to deploy. Thousands of engineers and companies could build in the Linux playground for free and do whatever they wanted, whereas Windows servers were restrictive, static, and costly.
Dall-E lost to Stable Diffusion and Flux because the latter were open source. You could fine-tune them on your own data, run them on your own machine, build your own extensions, build your own business. ComfyUI, IPAdapter, ControlNet, Civitai... It's a flourishing ecosystem, and Dall-E has none of that.
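The "run it on your own machine" part really is this short; a minimal sketch assuming the diffusers library and a local GPU, with the checkpoint id as a placeholder:

```python
# Sketch only: local inference with open image-model weights.
import torch
from diffusers import StableDiffusionPipeline

# Download the open weights once, then everything runs on your own hardware.
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",  # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a watercolour map of an island").images[0]
image.save("island.png")
```

Every extension in that ecosystem (fine-tunes, ControlNet, custom pipelines) is some variation on hooking into this loop, which is exactly what a closed API doesn't let you do.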
It'll happen with LLMs (Llama, Qwen, DeepSeek), video models (Hunyuan, LTX), and quite possibly the whole space.
One company can only do so much, and there is no real moat. You can't beat the rest of society once they overcome the activation energy.
And any third-place player will be compelled to open source their model to get users. Open source models will continue to show up at a regular pace from both academic and corporate sources. Meta is releasing stuff to salt the earth and prevent new FAANGs from being minted: commoditizing their complement.
> If DeepSeek don't have a competitive advantage, then no-one has a competitive advantage.
There is no moat. Smaller models are just a few months behind large proprietary ones, and the distribution of tasks might be increasingly solvable with smaller models, leaving little demand for the top models, which are also more expensive.
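For a sense of how low the bar already is, here's a sketch of serving a sub-1B open model on commodity hardware, assuming the transformers library; the model id is just one example of a small instruct model:

```python
# Sketch only: a small open model handling an everyday task locally.
from transformers import pipeline

generate = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
print(generate("Explain what a moat is in business terms.",
               max_new_tokens=80)[0]["generated_text"])
```

If this tier of model covers the bulk of the task distribution, the premium the top models can charge keeps shrinking.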