Hacker News

LLM inference is far from our highest AI-related cost, so we basically don't bother optimizing LLM spend.

Obviously we don't use the super expensive models like GPT-4.5. But we don't really bother with the mini models either, because GPT-4.1 etc. are cheap enough.

Stuff like speech-to-text is still way more expensive, and yes, there we do focus on cost optimization. We have no large-scale image generation use cases (yet).
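For a sense of how the speech-to-text line item can dwarf LLM spend, here's a back-of-the-envelope sketch. Every rate and volume in it is a made-up placeholder, not real pricing; the point is just that per-minute audio billing adds up faster than per-token billing on a cheap model.

```python
# Back-of-the-envelope comparison of monthly LLM vs speech-to-text spend.
# All rates and volumes below are hypothetical placeholders, not real pricing.

def monthly_llm_cost(tokens_per_month, usd_per_million_tokens):
    """Cost of LLM usage billed per million tokens."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

def monthly_stt_cost(audio_minutes_per_month, usd_per_minute):
    """Cost of speech-to-text usage billed per audio minute."""
    return audio_minutes_per_month * usd_per_minute

# Hypothetical workload: 500M tokens and 200k audio minutes per month.
llm = monthly_llm_cost(500_000_000, usd_per_million_tokens=2.00)  # $1,000
stt = monthly_stt_cost(200_000, usd_per_minute=0.006)             # ~$1,200

print(f"LLM: ${llm:,.2f}")
print(f"STT: ${stt:,.2f}")
```

Even at a modest hypothetical per-minute rate, the audio bill matches or exceeds the token bill, which is why the optimization effort goes there first.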


