OpenAI has evidence that DeepSeek distilled OpenAI model after illegal access

rvz · 2025-01-29T11:57:06 1738151826

> As the leading builder of AI, we engage in countermeasures to protect our IP...

What do you mean by "our IP" for AI models? What about the copyrighted data that was taken from others without their permission and profited from, which is why news publishers sued them?

At this point, the only valuable IP they legitimately have is the name "ChatGPT". That's it.

DeepSeek did more for humanity by giving it away for free than OpenAI did the moment they closed up their models.

iugtmkbdfil834 · 2025-01-29T13:04:46 1738155886

The whole Wall Street reaction was fairly fascinating. I have no illusions about the Chinese government, but as with Meta's move, this is bound to generate similar waves. I have already seen posts online asking to curb FOSS from China, suggesting the establishment was hoping this wouldn't happen (and we can speculate as to why).

latexr · 2025-01-29T11:27:51 1738150071

Sam Altman’s mask of pretending they are developing AI “for the good of humanity” continues to crumble.

42lux · 2025-01-29T11:30:18 1738150218

Your editorialized headline is false and misleading, and what it describes is not technically possible with only API access.

isaacfrond · 2025-01-29T15:25:25 1738164325

The headline may be false, but it accurately reflects the article.

> The outlet’s sources said Microsoft security researchers detected that large amounts of data were being exfiltrated through OpenAI developer accounts in late 2024, which the company believes are affiliated with DeepSeek. OpenAI told the Financial Times that it found evidence linking DeepSeek to the use of distillation (...)

In fact, it is exactly the point of the article that more access was used than is provided in the API. Distillation is not possible with API access.

42lux · 2025-01-29T16:48:08 1738169288

If OpenAI says they walked away with the weights, this is a possibility, but it doesn't look like they have, so it can't be a distill.

isaacfrond · 2025-01-29T18:36:23 1738175783

To distill a neural net, the only thing you need is access to the softmax probabilities. You don’t need the weights. This is entirely consistent with the story: large amounts of data being exfiltrated, not weights but the raw output before sampling.
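
To make the claim above concrete, here is a minimal sketch of distillation from soft targets. It is a toy under stated assumptions, not a description of what any company did: the three-word vocabulary, the `teacher_probs` values, and all function names are hypothetical, and the "student" is just a vector of logits for a single token position. The point is only that the training signal comes from the teacher's output probabilities, with the teacher's weights never appearing anywhere.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_probs, student_logits, temperature=1.0):
    """Cross-entropy of the student's distribution against the teacher's soft targets."""
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q)
                for p, q in zip(teacher_probs, student_probs) if p > 0)

def loss_gradient(teacher_probs, student_logits, temperature=1.0):
    """Gradient of the loss with respect to the student's logits: (q - p) / T."""
    student_probs = softmax(student_logits, temperature)
    return [(q - p) / temperature
            for p, q in zip(teacher_probs, student_probs)]

# Hypothetical teacher output for one token position (assumed values).
teacher_probs = [0.7, 0.2, 0.1]
student_logits = [0.0, 0.0, 0.0]  # untrained student: uniform guess

# Plain gradient descent on the student's logits; the teacher's weights
# are never used, only its output probabilities.
learning_rate = 1.0
for step in range(200):
    grad = loss_gradient(teacher_probs, student_logits)
    student_logits = [z - learning_rate * g
                      for z, g in zip(student_logits, grad)]

student_probs = softmax(student_logits)
print([round(q, 4) for q in student_probs])
```

In a real setting the student would be a full network and the loss would be averaged over many prompts and token positions, but the structure is the same: collect the teacher's output distributions, then minimize the cross-entropy against them.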