It's often vastly more expensive to inference, but vastly cheaper and faster to train / set up.
Many LLM use cases could be solved by a much smaller, specialized model and/or a bunch of if statements or regexes, but training the specialized model and coming up with the if statements requires programmer time, an ML engineer, human labelers, an eval pipeline, ml ops expertise to set up the GPUs etc.
With an LLM, you spend 10 minutes to integrate with the OpenAI API, and that's something any programmer can do, and get results that are "good enough".
If you're extremely cash-poor, time-rich and have the right expertise, making your own model makes sense. Otherwise, human time is more valuable than computer time.
I'm in an AI focused education research group, and most "smart/personalized tutors" on the market have similar processes and outcomes as paper flashcards.
That was happening even when they were still calling it machine learning in the papers. Longer before that still. It’s the way some people reliably get papers out for better or worse. Find a known phenomenon with existing published methods, use the same dataset potentially using new method of the day, show there’s a little agreement between the old “gold standard” and your method, and boom, new paper for your cv on $hotnewmethod you can now land jobs with. Never mind no one will cite it. That’s not the point here.