Yeah, this is a bizarre statement, because GPT-3 is really not any of those things. GPT-3 is far, far too computationally expensive to be of much value in industry. A linear CRF is more useful than most NN approaches in industry right now, simply because in many circumstances you want something you can run over a few billion documents, get the result within a few hours, then tweak a few things and repeat (rough sketch of what I mean below). These simple models are also predictable. Some transformer or LSTM methods can be useful in industry, but it really depends on the application. I certainly would not use GPT-like systems for much in industry beyond marketing gimmicks.
GPT-3 is useful for academia - not industry.
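For concreteness, here is a minimal sketch of the kind of lightweight sequence model I mean, using the sklearn-crfsuite package; the feature functions, toy data, and hyperparameters are illustrative choices of mine, not anything specific:

```python
# Minimal linear-chain CRF sketch (assumes sklearn-crfsuite is installed).
import sklearn_crfsuite

def word_features(sentence, i):
    """Cheap hand-crafted features for token i."""
    word = sentence[i]
    return {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "is_digit": word.isdigit(),
        "suffix3": word[-3:],
        "prev_lower": sentence[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": sentence[i + 1].lower() if i < len(sentence) - 1 else "<EOS>",
    }

# Toy training data: one tokenized sentence with NER-style tags.
sentences = [["Acme", "Corp", "hired", "Alice", "in", "June"]]
labels = [["B-ORG", "I-ORG", "O", "B-PER", "O", "O"]]

X = [[word_features(s, i) for i in range(len(s))] for s in sentences]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, labels)

# Inference is a handful of sparse feature lookups per token, which is why a
# single machine can stream through huge corpora far faster than a big transformer.
print(crf.predict(X))
```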
Hm? GPT-3 is relatively cheap to run inference on, at least compared to the cost of training it. You can actually load all the params onto a single TPU: a TPU host can allocate up to 300GB of CPU memory without OOM'ing.
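For a rough sense of what "all the params" means in memory terms, here is a back-of-envelope calculation; the parameter count and byte widths are my own assumptions, not exact figures:

```python
# Approximate weight-storage footprint for a GPT-3-scale model.
N_PARAMS = 175e9  # assumed "davinci"-scale parameter count

for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    gb = N_PARAMS * bytes_per_param / 1e9
    print(f"{name:>9}: ~{gb:,.0f} GB of weights")

# ~700 GB at fp32, ~350 GB at fp16, ~175 GB at int8; whether the weights
# fit in ~300 GB of host memory depends on the precision they are stored in.
```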
AI Dungeon is also powered by GPT-3, and it's quite snappy. I'm not sure why GPT-3 is seen as computationally expensive; in practice it seems workable.
GPT-3 is not that expensive. Estimating from the paper, the GPU hardware needed to train the model cost a few million dollars, and the electricity was probably under $100k. This is totally feasible for many companies today, especially if the hardware is treated as a fixed cost and reused for training multiple models.
And as mentioned elsewhere, inference for a trained model is much, much cheaper.
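A rough back-of-envelope on that electricity figure, assuming the ~3.14e23 FLOPs of total training compute reported in the paper and V100 throughput/power/price numbers that are my own guesses:

```python
# Back-of-envelope electricity cost for GPT-3-scale training.
TOTAL_FLOPS = 3.14e23           # total training compute reported in the paper
EFFECTIVE_FLOPS_PER_GPU = 3e13  # assumed ~30 TFLOP/s sustained per V100
WATTS_PER_GPU = 450             # assumed GPU power plus host/cooling overhead
PRICE_PER_KWH = 0.07            # assumed datacenter electricity rate, USD

gpu_hours = TOTAL_FLOPS / EFFECTIVE_FLOPS_PER_GPU / 3600
energy_kwh = gpu_hours * WATTS_PER_GPU / 1000
electricity_usd = energy_kwh * PRICE_PER_KWH

print(f"GPU-hours:   {gpu_hours:,.0f}")         # ~2.9 million
print(f"Energy:      {energy_kwh:,.0f} kWh")    # ~1.3 million kWh
print(f"Electricity: ${electricity_usd:,.0f}")  # ~$90k, consistent with "under $100k"
```

The hardware side is harder to pin down, since it depends on how many GPUs you buy versus rent and at what price, but the electricity estimate above is a small fraction of the overall cost either way.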