If what you're saying is true, why are we hearing about AI companies wanting to build nuclear power plants to power new data centers they think they need to build?
Are you saying all of that new capacity is needed to power non-LLM stuff like classifiers, adtech, etc? That seems unlikely.
Had you said that inference costs are tiny compared to the upfront cost of training the base model, I might have believed it. But even that isn't accurate -- there's a big upfront energy cost to train a model, but once it becomes popular like GPT-4, the inference energy cost over time is dramatically higher than the upfront training cost.
You mentioned batch computing as well, but how does that fit into the picture? I don't see how batching would reduce energy use. Does "doing lots of work at once" somehow reduce the total work / total energy expended?
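For what it's worth, the training-versus-inference point above is easy to sanity-check with a back-of-the-envelope calculation. This is only a sketch: the function name and every number in it are made-up placeholders, not measurements of GPT-4 or any other model. It just shows how a break-even point would be computed once you plug in real figures.

```python
def breakeven_days(training_kwh, kwh_per_request, requests_per_day):
    """Days of serving after which cumulative inference energy equals training energy."""
    daily_inference_kwh = kwh_per_request * requests_per_day
    return training_kwh / daily_inference_kwh


# All three inputs are hypothetical placeholders chosen for round arithmetic.
TRAINING_KWH = 1_000_000        # assumed one-off training cost
KWH_PER_REQUEST = 0.001         # assumed energy per served request
REQUESTS_PER_DAY = 10_000_000   # assumed traffic for a popular model

days = breakeven_days(TRAINING_KWH, KWH_PER_REQUEST, REQUESTS_PER_DAY)
print(f"Inference catches up with training after about {days:.0f} days")
```

With these placeholder inputs, inference overtakes training in a matter of months. Less traffic pushes the break-even further out and more traffic pulls it in, which is why popularity matters so much to the comparison.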
> If what you're saying is true, why are we hearing about AI companies wanting to build nuclear power plants to power new data centers they think they need to build?
Well, partly because they (all but X, IIRC) have commitments to shift to carbon-neutral energy.
But also, from the article:
> ChatGPT is now estimated to be the fifth-most visited website in the world
That's ChatGPT today. They're looking ahead to 100x-ing (or 1,000,000x-ing) the usage as AI replaces more and more existing work.
I can run Llama 3 on my laptop, and we can measure the energy usage of my laptop--it maxes out at around 0.1 toasters. o3 is presumably a bit more energy intensive, but the reason it's using a lot of power is the >100MM daily users, not that a single user uses a lot of energy for a simple chat.
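To put rough numbers on that, here is a small sketch. The 1,000 W toaster, the 30-second chat, and the 10 chats per day are my assumptions for illustration only. The 0.1 toasters and the 100MM users come from the comment above, and treating a hosted model like a laptop is itself a simplification.

```python
JOULES_PER_KWH = 3.6e6


def daily_energy_kwh(watts, seconds_per_chat, chats_per_user, users):
    """Total daily energy in kWh if every chat draws `watts` for `seconds_per_chat`."""
    joules = watts * seconds_per_chat * chats_per_user * users
    return joules / JOULES_PER_KWH


TOASTER_WATTS = 1000                 # assumption: a typical toaster
laptop_watts = 0.1 * TOASTER_WATTS   # "0.1 toasters" from the comment
SECONDS_PER_CHAT = 30                # assumption
CHATS_PER_USER = 10                  # assumption

one_user = daily_energy_kwh(laptop_watts, SECONDS_PER_CHAT, CHATS_PER_USER, 1)
everyone = daily_energy_kwh(laptop_watts, SECONDS_PER_CHAT, CHATS_PER_USER, 100_000_000)

print(f"one user:   {one_user:.4f} kWh/day")
print(f"100M users: {everyone:,.0f} kWh/day")
```

The per-user figure stays tiny whatever reasonable inputs you pick. The total is large only because it gets multiplied by the user count.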
> not that a single user uses a lot of energy for a simple chat.
This seems like a classic tragedy of the commons, no? An individual has a minor impact, but the collective's rational switch to LLM tools will likely have a massive impact.
> If what you're saying is true, why are we hearing about AI companies wanting to build nuclear power plants to power new data centers they think they need to build?
Something to temper this: lots of these AI datacenter projects are being cancelled or put on hiatus because the demand isn't there.
But if someone wants to build a nuke reactor to power their datacenter, awesome. No downsides? We are concerned about energy consumption only because of its impact on the earth in terms of carbon footprint. If it's nuclear, the problem has already been solved.
AI seems like it is speedrunning all the phases of the hype cycle.
"TD Cowen analysts Michael Elias, Cooper Belanger, and Gregory Williams wrote in the latest research note: “We continue to believe the lease cancellations and deferrals of capacity points to data center oversupply relative to its current demand forecast.”"
Because training costs are sky-high, and handling an individual request still uses a decent amount of energy even if it isn't as horrifying as training. Plus the number of requests, and the amount of content in them, is going up with stuff like vibe coding.