Two things stand out. First, Stable Diffusion's weights and supporting Python code were open-sourced and released to the public. Anyone can find the ckpt file and the Python code online, download them, and (with a bit of work) run the model on their own computer if they have the hardware for it (any reasonable GPU). There has been no such release from OpenAI; the closest we got is Facebook's leaked LLaMA model, and that's not a chatbot. So we don't have a model to run. Stability AI/Emad paid to train the model, which cost something like half a million dollars in GPU time (he obviously didn't pay the retail price of ~$600k, but you also don't get it right on the first shot), and then gave the output away. It's not clear how much it would cost to train a chatbot comparable to ChatGPT, but the impression is that it would take much more.
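For illustration, here's roughly what "run it yourself" looks like. This is a minimal sketch using the Hugging Face diffusers library rather than the original ckpt-plus-scripts release; the model ID and prompt are illustrative choices, not anything from the thread:

    # Minimal local Stable Diffusion run (assumes a CUDA GPU and that
    # diffusers/torch are installed; the model ID is illustrative).
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,  # halves memory use on the GPU
    ).to("cuda")

    image = pipe("a watercolor painting of a lighthouse").images[0]
    image.save("lighthouse.png")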

The second thing is that it's not clear that we, the Internet at large, would actually benefit from the model's release. Stable Diffusion is about 4 GB and runs on all sorts of consumer-grade hardware, which is what made the renaissance around it possible. ChatGPT makes liberal use of Nvidia A100 GPUs available as compute in Azure (AWS and GCP, along with many smaller AI-focused cloud companies, also offer them). One of those costs something like $10k, and you need several of them to run ChatGPT. So even if OpenAI were to live up to their name and release ChatGPT's model, only businesses and research labs would actually have the hardware to run it. It would be great to have the weights, but you wouldn't get the same army of developers able to work on it.
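To make "several of them" concrete, here's a back-of-envelope sketch. OpenAI hasn't published ChatGPT's parameter count, so this assumes a GPT-3-scale 175B-parameter model held in fp16:

    # Rough sizing: why one A100 isn't enough. 175B is an assumption
    # (GPT-3's published size, used here as a stand-in for ChatGPT).
    import math

    params = 175e9
    weights_gb = params * 2 / 1e9   # fp16 = 2 bytes per parameter -> 350 GB
    a100_gb = 80                    # largest A100 memory variant
    print(math.ceil(weights_gb / a100_gb), "A100s just to hold the weights")
    # -> 5, and that's before activations and the KV cache

Which is why only outfits that can afford a multi-GPU server could serve it at all.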

There are open-source LLM chatbots out there, so I do think we'll see one become popular eventually, but the above explains why it'll be a while before we do.



I have been fooling around with the small 7B LLaMA models. They chat, but they are pretty dumb compared to ChatGPT: they are terser and they confabulate more, even about things that are common knowledge. Judging from questions about current events, the model seems to have been trained on data up to early 2020.
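If you want to reproduce that kind of poking around, here's a hedged sketch with the transformers library. It assumes the leaked weights have already been converted to Hugging Face format at a local path (the path is hypothetical); note that base LLaMA is a plain language model, not instruction-tuned, which is part of why it feels dumb next to ChatGPT:

    # Prompting a 7B LLaMA checkpoint (needs transformers >= 4.28 and
    # accelerate installed; ./llama-7b-hf is a hypothetical local path).
    import torch
    from transformers import LlamaForCausalLM, LlamaTokenizer

    tok = LlamaTokenizer.from_pretrained("./llama-7b-hf")
    model = LlamaForCausalLM.from_pretrained(
        "./llama-7b-hf", torch_dtype=torch.float16, device_map="auto"
    )

    # A current-events question is a quick way to probe the training cutoff.
    prompt = "Q: Who was president of the US in 2021?\nA:"
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=50)
    print(tok.decode(out[0], skip_special_tokens=True))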

I haven't seen much output yet from the biggest 65B-parameter LLaMA model. You can rent a cloud VM on vast.ai that can run it for about $1.25 an hour, but ChatGPT is $20 a month, so why bother, unless you value the fully uncensored aspect.
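The break-even is easy to sanity-check, using the figures straight from the comment above:

    # Rented-VM route vs. ChatGPT Plus, numbers from the comment above.
    vast_rate = 1.25          # $/hour on vast.ai for a 65B-capable VM
    chatgpt_plus = 20.00      # $/month
    print(f"break-even at {chatgpt_plus / vast_rate:.0f} hours/month")  # 16

So unless you're chatting more than ~16 hours a month, the subscription is cheaper.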



