Maybe they also do that, but I work with a class of problems* that no model other than R1 has managed to crack, and that is still the case today.
Remember that DeepSeek is the offshoot of a hedge fund that was already using machine learning extensively, so they probably have troves of high quality datasets and source code repos to throw at it. Plus, they might have higher quality data for the Chinese side of the internet.
* Of course I won't detail my class of problems, otherwise my benchmark would quickly stop being useful. I'll just say that it is an undergraduate-level CS task that requires quite a bit of deductive reasoning.
This is completely irrelevant without knowing whether you are prompting each model effectively. Your workflow may just suit one particular model and not others, and tuning a workflow for each model is tedious. I seriously doubt there is ANY class of problem DSR1 can solve that OAI's third-tier model (o4-mini) can't at this point.