> The biggest problem: developers don’t want GPUs. They don’t even want AI/ML models. They want LLMs.

No, I want GPUs. BERT models are still useful.

The point is that your service is so expensive that one or two months of rental fees would pay for building a PC from scratch and putting it somewhere in your workplace to run 24/7. For applications that need GPU power, downtime and latency usually don't matter much. And you can always add an extra server for redundancy.
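To make the break-even argument concrete, here is a rough sketch of the arithmetic. Every number in it is a made-up placeholder, not a real price from any provider or parts list, and the `breakeven_months` helper is just a name I picked for illustration. Plug in your own figures.

```python
import math


def breakeven_months(build_cost: float, monthly_rent: float,
                     monthly_power: float = 0.0) -> float:
    """Months of renting after which owning the machine is cheaper.

    All inputs are in the same currency. monthly_power is the running
    cost of the self-hosted box (electricity etc.), assumed constant.
    """
    monthly_saving = monthly_rent - monthly_power
    if monthly_saving <= 0:
        # Renting is never more expensive per month, so no break-even.
        return math.inf
    return build_cost / monthly_saving


# Hypothetical placeholder figures, NOT real quotes:
# a 2300 build, 1200/month rental, 50/month electricity.
print(breakeven_months(build_cost=2300.0, monthly_rent=1200.0,
                       monthly_power=50.0))
```

This ignores things like your own time, hardware failures, and resale value, so treat it as a back-of-the-envelope check rather than a full cost model.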