You're right about the cost question, but I think the added dimension that people are worried about is the current pace of change.
To abuse the idiom a bit, yesterday's hardware should be able to run tomorrow's models, as you say, but it might not be able to run next month's models (acceptably or at all).
Fast-forward some number of years, as the pace slows. Then-yesterday's hardware might still be able to run next-next year's models acceptably, and someone might find that hardware to be a better, safer, longer-term investment.
I think of this similarly to how the pace of mobile phone development has changed over time. In 2010 it was somewhat reasonable to want to upgrade your smartphone every two years or so: every year the newer flagship models were actually significantly faster than the previous year, and you could tell that the new OS versions would run slower on your not-quite-new-anymore phone, and even some apps might not perform as well. But today in 2025? I expect to have my current phone for 6-7 years (as long as Google keeps releasing updates for it) before upgrading. LLM development over time may follow at least a superficially similar curve.
Regarding the equivalent EC2 instance, I'm not comparing it to the cost of a homelab, I'm comparing it to the cost of an Anthropic Pro or Max subscription. I can't justify the cost of a homelab (the capex, plus the opex of electricity, which is expensive where I live), when in a year that hardware might be showing its age, and in two years might not meet my (future) needs. And if I can't justify spending the homelab cost every two years, I certainly can't justify spending that same amount in 3 months for EC2.
I repeat: OP's home server costs as much as a few months of a cloud provider's infrastructure.
To put it another way, OP can buy brand new hardware a few times per year and still save money compared with paying a cloud provider for equivalent hardware.
> Regarding the equivalent EC2 instance, I'm not comparing it to the cost of a homelab, I'm comparing it to the cost of an Anthropic Pro or Max subscription.
OP stated quite clearly their goal was to run models locally.
> Fair, but at the point you trust Amazon hosting your "local" LLM, it's not a huge reach to just use Amazon Bedrock or something
I don't think you even bothered to look at Amazon Bedrock's pricing before making that suggestion. They charge users per input token and per output token. On Amazon Bedrock, a single chat session involving 100k tokens can cost you $200. That alone is a third of OP's total infrastructure cost.
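To make the per-token arithmetic concrete, here's a minimal sketch of how a long chat session racks up charges. The rates and the history-resending assumption below are mine, picked as illustrative placeholders; they are not Bedrock's published figures, so check the provider's pricing page before relying on them:

```python
# Rough sketch of per-token billing for a hosted LLM API.
# Rates are hypothetical placeholders, NOT Bedrock's actual
# price list; real rates vary by model and region.

INPUT_RATE = 0.015   # $ per 1K input tokens (assumed)
OUTPUT_RATE = 0.075  # $ per 1K output tokens (assumed)

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a session under simple per-token billing."""
    return (input_tokens / 1000) * INPUT_RATE + (output_tokens / 1000) * OUTPUT_RATE

def chat_loop_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Billed input tokens for a multi-turn chat where the full
    history is resent on every turn: 1x + 2x + ... + turns*x."""
    return tokens_per_turn * turns * (turns + 1) // 2

# A 100-turn chat at ~1K tokens per turn bills ~5M input tokens,
# even though the conversation itself only "contains" ~100K tokens.
billed = chat_loop_input_tokens(100, 1_000)
print(billed)                                    # 5050000
print(f"${session_cost(billed, 50_000):.2f}")    # $79.50
```

The point of the sketch is the quadratic term: because chat APIs are stateless, every turn resends the whole history as input, so the billed token count grows much faster than the conversation itself.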
If you want to discuss options in terms of cost, the very least you should do is look at pricing.