>June 26 (Reuters) - Chinese AI startup DeepSeek has not yet determined the timing of the release of its R2 model as CEO Liang Wenfeng is not satisfied with its performance,
>Over the past several months, DeepSeek's engineers have been working to refine R2 until Liang gives the green light for release, according to The Information.
But yes, it is strange how the majority of the article is about lack of GPUs.
I am pretty sure that The Information has no access to, or sources at, DeepSeek. At most they are basing their article on selective, random internet chatter among those who follow Chinese AI.
Presumably there is a CEO statement somewhere. If DeepSeek said May, but it is almost July, that would call for some comment from them.
Although I'd like to know the source for the "this is because of chip sanctions" angle. SMIC is claiming they can manufacture at 5nm, and a large number of 7nm chips can deliver the same amount of compute as anything Nvidia produces. It wouldn't be market-leading competitive, but delaying the release for a few months doesn't change that. I don't really see how DeepSeek production release dates and the chip sanctions could be linked in the small. Unless they're just including that as an aside.
It is pretty strange that DeepSeek didn't say May anywhere; that was also a Reuters report based on "three people familiar with the company".[1] DeepSeek itself did not respond and did not make any claims about the timeline, ever.
The phrasing for quoting sources is extremely codified: it means the journalists have verified who the sources are (either insiders or people with access to insider information).
Sure, if you don't trust anything, what's the point? There's a lot of information that relies on anonymous sources, and we usually use a third party to vet them (otherwise how would they stay anonymous?). Without this system we'd be missing out on a lot of things (if only named sources were used, a lot of things would never come out).
(A lot of things break down in society without trust, maybe that's already how the US is? Where I live it is thankfully still somewhat ok)
The Washington Post, The New York Times, The New Republic, The Intercept, Rolling Stone, CBS News, CNN, Newsweek, USA Today, NBC News, Der Spiegel (Germany), The Sunday Times (UK), Daily Mail (UK), Al Jazeera (Qatar), RT (Russia), Xinhua (China), Press TV (Iran), Haaretz (Israel), Le Monde (France), El País (Spain) all have been caught using fake anonymous sources.
Welcome to most China news. Many "well-documented" China "facts" are in fact cases like this: the media taking rumors or straight up fabricating things for clicks, and then self-referencing (or different media referencing each other in a circle) to put up the guise of reliable news.
This is why we need to be critical of journalists nowadays. No longer are they the Fourth Estate, protecting society and democracy by providing accurate information.
That sounds to me like you are excusing a bad reality based on a nonexistent ideal. Saying "there are bad journalists" is a huge understatement. There are many, perhaps even the majority. Ask yourself why society at large has stopped trusting mainstream media: it's not just because there are a "few" bad apples but because the bad apples are widespread and systemic.
The tendency to compare to a nonexistent ideal is also something I find very, very weird. This tendency does not exist for many other concepts. For example, when people talk about communism and someone says "hey $COUNTRY is just one bad apple, it doesn't mean real communism is bad", others are quick to respond with "but all countries doing communism have devolved into tyranny/dictatorship/etc, so real communism doesn't exist and what we've seen is the real deal". I am not criticizing that (common) point of view, but people ought to take responsibility and apply this principle equally to all concepts, including "journalism".
It also doesn't follow that my critique of journalists/journalism means tearing down journalism altogether. It can also mean:
- that people need to stop trusting mainstream journalists blindly on topics they're not adept in. Right now many people have stopped trusting mainstream journalists only for topics they're adept in, but as soon as those journalists write nonsense about something else (e.g. $ENEMY_STATE) then they swallow that uncritically. No. The response should be "they lied about X, what else are they lying about?" instead of letting themselves be manipulated in other areas.
- that society as a whole needs to hold journalism accountable, and demand that they return to the role of the Fourth Estate.
> Ask yourself why society at large has stopped trusting mainstream media
Because certain political interests take the existence of a fact-based, independent power center as a threat to their own power?
And so engineered a multi-decade campaign to indoctrinate people against the news/media, thus removing a roadblock to imposing their own often contrary-to-fact narratives?
Pretending this happened in a vacuum or was grassroots ignores mountains of money deployed with specific intent over spans of time.
> It can also mean that society as a whole needs to hold journalism accountable, and demand that they return to the role of the Fourth Estate.
I absolutely agree with this.
If I had my druthers, the US would reinstate the fairness doctrine (abolished in 1987) and specifically the components requiring large media corporations to subsidize non-profit newsrooms as a public good.
The US would be a better place if we banned 24/7 for-profit news.
>Reporting by Deborah Sophia in Bengaluru; Editing by Arun Koyyur
Kek. Reminder: after the Sino-Indian drama, India has basically 0 accredited journalists in China. The chances of an Indian journalist in Bengaluru "citing two people with knowledge of the situation" at DeepSeek before it spreads through the PRC rumor mill are vanishingly small.
Yes. And the people behind that random Internet chatter almost certainly don't know what they are talking about at all.
First, nobody is training on H20s, it's absurd. Then their logic was: because of the high inference demand for DeepSeek models there is high demand for H20 chips, and H20s were banned, so better not release new model weights now, otherwise people would want H20s even more.
Which is... even more absurd. The reasoning itself doesn't make any sense. And the technical part is just wrong, too. Using H20 to serve DeepSeek V3 / R1 is just SUPER inefficient. Like, R1 is the most anti-H20 model ever released.
The entire thing makes no sense at all and it's a pity that Reuters fell for that bullshit.
MLA uses way more flops in order to conserve memory bandwidth, H20 has plenty of memory bandwidth and almost no flops. MLA makes sense on H100/H800, but on H20 GQA-based models are a way better option.
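The compute-versus-bandwidth argument above can be made concrete with a rough roofline check. This is a minimal sketch, not a benchmark: the `Gpu` class and `bound` helper are hypothetical names, the hardware figures are approximate public dense-BF16 numbers that I am assuming for illustration, and the 100 FLOP/byte kernel is invented.

```python
# Rough roofline check: is a kernel limited by compute or by memory bandwidth?
# Hardware figures are approximate public numbers, assumed for illustration only.
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    peak_tflops: float     # dense BF16 tensor throughput in TFLOP/s (assumed)
    bandwidth_tbps: float  # HBM bandwidth in TB/s (assumed)

    def ridge_point(self) -> float:
        """Arithmetic intensity (FLOP/byte) where compute and memory balance."""
        # TFLOP/s divided by TB/s leaves FLOP per byte.
        return self.peak_tflops / self.bandwidth_tbps


def bound(gpu: Gpu, flops_per_byte: float) -> str:
    """Classify a kernel by its arithmetic intensity on the given GPU."""
    if flops_per_byte > gpu.ridge_point():
        return "compute-bound"
    return "memory-bound"


H20 = Gpu("H20", peak_tflops=148.0, bandwidth_tbps=4.0)
H100 = Gpu("H100 SXM", peak_tflops=989.0, bandwidth_tbps=3.35)

if __name__ == "__main__":
    for gpu in (H20, H100):
        print(f"{gpu.name}: ridge point ~{gpu.ridge_point():.0f} FLOP/byte")
    # Hypothetical kernel doing 100 FLOPs per byte read from memory.
    for gpu in (H20, H100):
        print(gpu.name, bound(gpu, 100.0))
```

Under these assumed figures the same kernel is compute-bound on the H20 and memory-bound on the H100, which is the shape of the argument being made: a design that trades extra FLOPs for less memory traffic helps on one and hurts on the other.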
Not sure what you are referring to. Do you have a pointer to a technical writeup, perhaps? In training and inference MLA has way fewer FLOPs than MHA, which is the gold standard, and way better accuracy (model performance) than GQA (see the comparisons in the DeepSeek papers, or try DeepSeek models vs Llama for long context).
More generally, with any hardware architecture you use, you can optimize the throughput for your main goal (initially training; later inference) by balancing other parameters of the architecture. Even if training is suboptimal, if you want to make a global impact with a public model, you aim for the next Nvidia inference hardware.
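One part of the MHA / GQA / MLA comparison above that is easy to pin down is the memory side: how many values each scheme caches per token per layer. The sketch below only counts cached elements and says nothing about FLOPs or accuracy. The dimensions are the ones I recall from the DeepSeek-V2 paper (128 heads of size 128, a 512-wide compressed latent plus a 64-wide decoupled RoPE key); treat them, and the 8-group GQA configuration, as assumptions.

```python
# Cached elements per token per layer for three attention variants.
# All dimensions below are assumptions for illustration.

def mha_cache(n_heads: int, head_dim: int) -> int:
    """Multi-head attention caches a full key and value for every head."""
    return 2 * n_heads * head_dim


def gqa_cache(n_kv_groups: int, head_dim: int) -> int:
    """Grouped-query attention caches one key/value pair per KV group."""
    return 2 * n_kv_groups * head_dim


def mla_cache(latent_dim: int, rope_dim: int) -> int:
    """Multi-head latent attention caches a shared latent plus a RoPE key."""
    return latent_dim + rope_dim


if __name__ == "__main__":
    n_heads, head_dim = 128, 128  # assumed model shape
    print("MHA:", mha_cache(n_heads, head_dim))               # 32768
    print("GQA (8 groups):", gqa_cache(8, head_dim))          # 2048
    print("MLA:", mla_cache(latent_dim=512, rope_dim=64))     # 576
```

A smaller cache means fewer bytes read per decoded token, which is why the choice interacts with how much memory bandwidth the target card has.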
Didn't DeepSeek figure out how to train with mixed precision and so get much more out of the cards, with a lot of the training steps able to run at precisions traditionally reserved for post-training quantization (block compressed)?
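For readers unfamiliar with the "block compressed" idea, here is a minimal sketch of block-wise scaled quantization. It is not DeepSeek's training recipe: it uses signed 8-bit integer codes as a stand-in for a low-precision float format, a block size of 4 chosen only to keep the example small, and hypothetical function names.

```python
# Block-wise scaled quantization: each block of values gets its own scale,
# so one large outlier only costs precision inside its own block.
# Integer codes stand in for a low-precision float format (an assumption).

def quantize_blockwise(values, block_size=4, levels=127):
    """Return (codes, scales): integer codes plus one scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        scale = max_abs / levels if max_abs > 0 else 1.0  # avoid divide-by-zero
        scales.append(scale)
        codes.extend(round(v / scale) for v in block)
    return codes, scales


def dequantize_blockwise(codes, scales, block_size=4):
    """Map integer codes back to floats using each block's scale."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


def max_abs_error(original, restored):
    """Largest absolute reconstruction error across paired values."""
    return max(abs(a - b) for a, b in zip(original, restored))


if __name__ == "__main__":
    values = [0.01, 0.02, -0.03, 0.04, 100.0, 50.0, -25.0, 10.0]
    # Per-block scales versus a single scale over the whole vector.
    for name, size in (("block-wise", 4), ("global", len(values))):
        codes, scales = quantize_blockwise(values, block_size=size)
        restored = dequantize_blockwise(codes, scales, block_size=size)
        print(name, "error on small values:",
              max_abs_error(values[:4], restored[:4]))
```

With a single global scale the four small values all collapse to zero because the outlier 100.0 sets the step size; with per-block scales they survive with small error.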