Hacker News

>Part of the reason LLMs aren't that big in the grand scheme of things is because they haven't been good enough and businesses haven't started to really adopt them. That will change, but the costs will be high because they're also extremely expensive to run. I think the author is focusing on the training costs for now, but that will likely get dwarfed by operational costs. What then?

Now maybe I'm naive somehow because I'm a machine-learning person who doesn't work on LLMs/big-ass-transformers, but uh... why do they actually have to be this large to get this level of performance?

Dunno! It could be the case that there just needs to be a trillion parameters to be useful enough outside of highly-constrained scenarios. But I would certainly challenge those who work on LLMs to figure out how to require far less compute for the same outcome.
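To put "a trillion parameters" in perspective, here's a rough back-of-envelope sketch of what serving at that scale costs in memory alone. The numbers (fp16 weights, 80 GB accelerators) are illustrative assumptions, not measurements from any real deployment:

```python
# Back-of-envelope: memory footprint of serving a 1T-parameter model.
# All constants below are illustrative assumptions.

PARAMS = 1_000_000_000_000   # 1 trillion parameters (assumed)
BYTES_PER_PARAM = 2          # fp16/bf16 weight storage
GPU_MEM_BYTES = 80 * 10**9   # one 80 GB accelerator (assumed)

weight_bytes = PARAMS * BYTES_PER_PARAM               # 2e12 bytes = 2 TB
gpus_for_weights = -(-weight_bytes // GPU_MEM_BYTES)  # ceiling division

print(f"Weights alone: {weight_bytes / 1e12:.1f} TB")
print(f"Minimum GPUs just to hold weights: {gpus_for_weights}")
# → Weights alone: 2.0 TB
# → Minimum GPUs just to hold weights: 25
```

And that's before activations, KV cache, or any redundancy for throughput, which is part of why operational costs can dwarf training costs at scale.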
