It's often vastly more expensive to run inference, but vastly cheaper and faster to train and set up.

Many LLM use cases could be solved by a much smaller, specialized model and/or a bunch of if statements or regexes. But training the specialized model and coming up with the if statements takes programmer time, an ML engineer, human labelers, an eval pipeline, MLOps expertise to set up the GPUs, etc.
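To make the "bunch of if statements or regexes" option concrete, here's a rough sketch of what it might look like for routing support messages. The labels, patterns, and the `classify` helper are all made up for illustration, and a real version would need far more rules plus labeled data to check them against, which is exactly where the human time goes.

```python
import re

# Hypothetical hand-written rules: (label, pattern). First match wins,
# so the order of the rules matters.
RULES = [
    ("refund", re.compile(r"\b(refund|money back|chargeback)\b", re.IGNORECASE)),
    ("login", re.compile(r"\b(password|log ?in|sign ?in|locked out)\b", re.IGNORECASE)),
    ("shipping", re.compile(r"\b(shipping|delivery|tracking)\b", re.IGNORECASE)),
]


def classify(text: str) -> str:
    """Return the label of the first rule whose pattern matches, else 'other'."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "other"
```

It costs next to nothing to run, but every misrouted message means someone has to go back and write or tune another rule by hand.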

With an LLM, you spend 10 minutes integrating with the OpenAI API, which any programmer can do, and you get results that are "good enough".

If you're extremely cash-poor, time-rich and have the right expertise, making your own model makes sense. Otherwise, human time is more valuable than computer time.