Woah. I never gave it my database schema, but it assumes I have a table called "users" (which is accurate) and that there's a timestamp field called "signup_time" for when a user signed up.
I am definitely impressed that it could get this close without knowledge of the schema, and that you can provide additional context about the schema. Seems like there is a lot of potential for building a natural language query engine that is hooked up to a database. I suppose there is always a risk that a user could generate a dangerous query, but that could be mitigated.
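The "dangerous query" risk seems tractable if you treat the model's output as untrusted input. Here's a minimal sketch of what I mean, with a hypothetical complete() standing in for whatever GPT-3 API access you have, and a made-up users schema (none of this is from the article):

    # Sketch: put the schema in the prompt so the model stops guessing column
    # names, then refuse to run anything that isn't a single SELECT statement.
    import re

    SCHEMA = "Table users: id (int), email (text), signup_time (timestamp)"

    def build_prompt(question: str) -> str:
        return (
            f"Schema:\n{SCHEMA}\n\n"
            "Translate the question into a single SQL SELECT statement.\n"
            f"Question: {question}\n"
            "SQL:"
        )

    def is_safe(sql: str) -> bool:
        # Crude allow-list: exactly one statement, and it must be a SELECT.
        stripped = sql.strip().rstrip(";")
        return ";" not in stripped and re.match(r"\s*select\b", stripped, re.IGNORECASE) is not None

    def nl_to_sql(question: str, complete) -> str:
        sql = complete(build_prompt(question))
        if not is_safe(sql):
            raise ValueError(f"Refusing to run generated query: {sql!r}")
        return sql

Pair that with a read-only database user and the blast radius of a bad generation is pretty small.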
Not related to the article but what exactly is "open" about OpenAI?
Nothing. It was a not-for-profit, but it converted itself into a for-profit entity and made an exclusive deal with Microsoft for GPT-3 (not sure how it's exclusive given all the beta API users).
Granted, training your own copy of GPT-3 would be beyond most people's means anyway (I think I read an estimate that training a model that big was a multi-million dollar effort).
I do think it's a bit dodgy not to change the name, though, when you change the core premise.
GPT-3 is the same "tech" as GPT-2, just with more training. GPT-2 is FOSS, and I have a feeling that OpenAI's next architecture (if there ever is one) will also be FOSS.
I think OpenAI just chose a bad name for this for-profit initiative: calling it "GPT-3" makes it sound like they were pivoting the company in a new direction with a new generation of tech.
Really, GPT-3 should have been called something more like "GPT-2 Pro Plus Enterprise SaaS Edition." (Let's say "GPT-2++" for short.) Then it would have been clear that:
1. "GPT-2++" is not a generational leap over "GPT-2";
2. an actual "GPT-3" would come later, and that it would be a new generation of tech; and
3. there would be a commercial "GPT-3++" to go along with "GPT-3", just like "GPT-2++" goes along with "GPT-2".
(I can see why they called it GPT-3, though. Calling it "GPT-2++" probably wouldn't have made for very good news copy.)
You make it sound as if GPT-3 is just the same GPT-2 model with some extra Enterprise-y features thrown in. They're completely different models, trained on different data, and of vastly different sizes: GPT-2 had 1.5B parameters, and GPT-3 has 175B. That's two orders of magnitude larger.
Sure, both models use the same structures (attention layers, mostly), so it's a quantitative change rather than a qualitative one. But there's still a hell of a big difference between the two.
Right, but GPT-2 was the name of the particular ML architecture they were studying the properties of; not the name of any specific model trained on that architecture.
There was a pre-trained GPT-2 model offered for download. The whole "interesting thing" they were publishing about was that models trained under the GPT-2 ML architecture were uniquely good at transfer learning, and so any pre-trained GPT-2 model of sufficient size would be extremely useful as a "seed" for doing your own model training on top of.
They built one such model, but that model was not, itself, "GPT-2."
Keep in mind, the training data for that model is open; you can download it yourself and reproduce the offered base-model from it if you like. That's because GPT-2 (the architecture) was formal academic computer science: journal papers and all. The particular pre-trained model, and its input training data, were just published as experimental data.
It is under that lens that I call GPT-3 "GPT-2++." It's a different model, but it's the same science. The model was never OpenAI's "product." The science itself was/is.
Certainly, the SaaS pre-trained model named "GPT-3" is qualitatively different from the downloadable pre-trained base model people refer to as "GPT-2." But so are all the various trained models people have built by training GPT-2 (the architecture) with their own inputs. The whole class of things trained on that architecture are fundamentally all "GPT-2 models." And so "GPT-3" is just one such "GPT-2 model." Just a really big, surprisingly useful one.
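Concretely, that "seed" usage is just transfer learning on top of the released checkpoint. Here's a rough sketch of what that looks like in practice, assuming the Hugging Face transformers port of the GPT-2 weights and a toy in-memory corpus (none of the names or hyperparameters here come from OpenAI's own setup):

    # Treat the released GPT-2 checkpoint as a "seed": load the pre-trained
    # weights, then keep training on your own text. Real fine-tuning would
    # batch, pad, and shuffle properly; this just shows the shape of it.
    from torch.optim import AdamW
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # smallest released checkpoint
    model = GPT2LMHeadModel.from_pretrained("gpt2")    # pre-trained weights as the starting point
    optimizer = AdamW(model.parameters(), lr=5e-5)

    corpus = ["your domain-specific text goes here", "and more of it here"]

    model.train()
    for epoch in range(3):
        for text in corpus:
            input_ids = tokenizer.encode(text, return_tensors="pt")
            # Plain language-modelling objective; passing labels=input_ids makes
            # the library compute the shifted next-token loss internally.
            loss = model(input_ids, labels=input_ids).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    model.save_pretrained("gpt2-finetuned")  # your own "GPT-2 model," in the taxonomy above

The result is exactly what I mean by "a GPT-2 model": same architecture, same published weights as the seed, your data on top.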
> Right, but GPT-2 was the name of the particular ML architecture they were studying the properties of; not the name of any specific model trained on that architecture.
That sounds like it would have been a reasonable choice for naming their research, but isn't the abbreviation "GPT" short for "Generative Pre-trained Transformer"? Seems like they're very specifically referring to the pre-trained model, which I would also take from the GPT-2 paper's abstract: "Our largest model, GPT-2, is a 1.5B parameter Transformer[...]" [1]
What I meant by my last statement is that no news outlet would have wanted to talk about "the innovative power of GPT-2 Enterprise." That just sounds fake, honestly. Every SaaS company wants to talk about the "innovative power" of the extra doodads they tack onto the Enterprise plans of their open-core product, when usually nobody is paying for the SaaS because of those doodads, but rather because they want the service, want the ops handled for them, and want enterprise support if it goes down.
But, by marketing it as a new version of the tech, "GPT-3", OpenAI gave journalists something they could actually report on without feeling like they're just shoving a PR release down people's throats. "The new generation of the tech can do all these amazing things; it's a leap forward!" is news. Even though, in this case, it's only a "quantity has a quality all its own" kind of "generational leap."
> OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity.
Certainly makes that statement seem less credible.
What if, and bear with me, strong AI poses real dangers and open sourcing extremely powerful models to everyone (including malicious actors and dictatorial governments) would actually harm humanity more than it benefits it?
> (including malicious actors and dictatorial governments) would actually harm humanity more than it benefits it?
I'm really glad that weapons aren't open source. Imagine if every dictatorship could get its hands on weapons. Luckily, they're hidden behind a paywall. /s
Incorrect. It is still a not-for-profit, which owns a for-profit entity. It is fairly common for charities to own much or all of a for-profit entity (e.g. Hershey Chocolate; or, per today's Matt Levine newsletter, a quarter of Kellogg's is still owned by the original Kellogg charity). And the exclusive deal was not for GPT-3, in the sense of any specific checkpoint, but for the underlying code.
- Hershey is a public company. Most certainly NOT owned by either a charity or a non-profit. The only way a non-profit comes into the picture is that a significant portion of their 'Class B' stock is owned by a trust which is dedicated to a non-profit (the Milton Hershey School). (https://www.thehersheycompany.com/content/dam/corporate-us/d... pp 36-37)
> To be tax-exempt under section 501(c)(3) of the Internal Revenue Code, an organization must be organized and operated exclusively for exempt purposes set forth in section 501(c)(3) [CHARITY], and none of its earnings may inure to any private shareholder or individual [NON-PROFIT].
You can be a charity (albeit not tax exempt) without being a non-profit, and moreover you can be a non-profit without being a charity. (See also https://www.irs.gov/charities-non-profits/other-nonprofits ; and keep in mind that still other types of non-profits are not tax-exempt at all!)
- Trust "owning" Hershey's: If you look at the document I cited, you'll note that the trust (which is still neither a charity nor a non-profit!) owns only 5.5% of Hershey's common stock.