I don't think those statements are contradictory at all. Making the thing is getting more expensive, but using it is getting cheaper. Electric cars could be a good analogy here: compared to an ICE car, the upfront cost is higher, but once you have one, it's cheaper to run.
That doesn’t make sense, though, if scaling is actually stalling. The reason so much compute goes into training now is scaling, which keeps each base model's lifetime short.