The term 'GPT' was first used in the BERT paper to refer to the Generative Pre-trained Transformer. [0]
> The fine-tuning approach, such as the Generative Pre-trained Transformer (OpenAI GPT) (Radford et al., 2018), introduces minimal task-specific parameters, and is trained on the downstream tasks by simply fine-tuning all pre-trained parameters.
Yes, but the paper you link to never actually calls it GPT or Generative Pre-trained Transformer. It talks about training Transformers with generative pre-training, both of which were pre-existing concepts by that point.
I also looked on the OpenAI website in Sep 2018 and could find no reference to GPT or Generative Pretrained Transformers, so I think OP might be right about BERT using it first.
[0] https://arxiv.org/abs/1810.04805