
You have to distinguish between the current model and DeepSeek the company. DeepSeek the company can do an OpenAI and stop releasing their weights any time they like. The knowledge and skill are retained.

I really wonder how long the current era of giving models away for free can last. How is this sensible from a business perspective? Facebook got burned by iOS and now engage in what would otherwise look like irrational behavior to avoid being locked into a supplier again, but even then, they don't really need to give Llama away for free. They could train and use it for themselves just fine.




If they're smart, and of course they are, they're not releasing the latest they have. They're releasing something good enough to show everyone that they're at parity or better compared to OpenAI. I imagine they already have internal models that exceed the open source one, so there's no real advantage in copying what they released.


Open models will win. OpenAI and the other regulatory capture gamers that want to hoard their precious will certainly be an interesting footnote for the history books.


You don't think FB are trying to neuter an emerging threat? They're kneecapping what could have been a trillion dollar company if it was more difficult to replicate their tech.


They would have to pay more to get researchers that don't publish.


I was talking about open weight models more than papers, but OpenAI hardly publishes papers anymore and doesn't seem to struggle to get researchers. Anthropic clearly has a lot of special sauce given Claude 3.5 Sonnet's performance on coding, yet the papers they publish are mostly safety related. So I'm not sure that's really true anymore.



