
Meta has released the model weights for OPT-175B, the model used in the paper. There are also a number of fully released LLMs from other labs on the way.



While OPT-175B is great to have publicly available, it needs a lot more training to achieve good results. Meta trained OPT on 180B tokens, compared to the 300B that GPT-3 saw. And the Chinchilla scaling laws suggest that a 175B-parameter model would need almost 4T tokens to get the most bang for the compute buck (rough arithmetic sketched below).

And on top of that, there are some questions about the quality of the open source data (The Pile) versus OpenAI's proprietary dataset, which they seem to have spent a lot of effort cleaning. So: open source models are probably data-constrained, in both quantity and quality.
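
The "almost 4T" figure comes from the commonly cited rule of thumb that Chinchilla-style compute-optimal training uses roughly 20 tokens per parameter; a back-of-the-envelope sketch (the exact ratio depends on the fitted scaling coefficients, so treat this as an estimate, not the paper's exact prescription):

  # Rough Chinchilla-style estimate: compute-optimal training uses
  # roughly 20 tokens per model parameter (Hoffmann et al., 2022).
  TOKENS_PER_PARAM = 20  # rule-of-thumb ratio, not an exact constant

  def chinchilla_optimal_tokens(n_params: float) -> float:
      return n_params * TOKENS_PER_PARAM

  print(chinchilla_optimal_tokens(175e9) / 1e12)
  # ~3.5 (i.e. ~3.5T tokens), versus the ~0.18T OPT-175B
  # and ~0.3T GPT-3 were actually trained on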


OPT-175B isn't publicly available, sadly. It's available to research institutions, which is much better than "Open"AI, but it doesn't help us hobbyists/indie researchers much.


I wonder when we'll start putting these models on the pirate bay or similar. Seems like an excellent use for the tech. Has no one tried to upload OPT-175B anywhere like that yet?


It could go on the clear net since trained weights aren't subject to copyright.


It’s fun to think about a few billion weights being the difference between useless and gold.


Looking at my bank account I can relate :)


Are there any that perform anywhere close to GPT-3?


Stable Diffusion was behind DALL-E, but has since surpassed it. Don't hear anyone talking about DALL-E anymore.



