From what I've seen, using these huge models for inference at any kind of scale is expensive enough that it's difficult to find a business case that justifies the compute cost.
Those models aren't trained with the objective of being deployed in production.
They are trained to be used as teachers during distillation into smaller models that fit the cost/latency requirements for whatever scenario those big companies have. That's where the real value is.