> a modern server uses 50-100 watts idle doing nothing
I'm really tired of hearing this. "Serverless because otherwise server doing nothing", "very small virtual machine because otherwise server doing nothing".
The server is not doing "nothing"; it's waiting for incoming requests. It's like saying "this cashier is doing nothing because there are no customers in the store".
When a server is loaded to capacity minus some margin, latencies go up, which may not always be acceptable. Also, not every web workload scales linearly or is cacheable, traffic patterns may not be that predictable, and some requests may generate much higher loads than others.
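To put a toy number on "latencies go up": in the classic single-queue M/M/1 approximation (a simplification, not a model of any real workload), mean response time is 1/(μ − λ), which explodes as utilization approaches 100%. The service rate below is a made-up figure purely for illustration:

```python
# M/M/1 toy model: mean response time T = 1 / (mu - lambda),
# with utilization rho = lambda / mu. Service rate is made up.
SERVICE_RATE = 100.0  # requests/sec one server can handle

for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    arrival_rate = rho * SERVICE_RATE
    t_ms = 1000.0 / (SERVICE_RATE - arrival_rate)
    print(f"utilization {rho:.0%}: mean response time {t_ms:6.1f} ms")
# 50% -> 20 ms, 90% -> 100 ms, 99% -> 1000 ms: that "margin" is the product.
```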
Managing capacity is way more involved than just "this server is doing nothing".
Also, many of the technologies that supposedly reduce "idle time", such as "serverless", are usually incredibly wasteful: handling a single request may spin up a completely new environment and pull resources from across the globe.
If there are 100 servers but only one is needed to handle the user traffic, then 99% of those servers are considered to be "doing nothing" even if they are powered on and running software. At the end of the day, running that software is meaningless to the business and to customers.
I think the point was that "ready and waiting" is valuable to the end customer, even if it only makes a difference later when they are doing something. It's kind of like how firemen are valuable even when they are not getting calls, because they are available for low-latency response instead of being busy doing something else. The idea that this is just wasted computation is therefore somewhat disingenuous.
Oh, but it could be improved. Linux can cold boot in under 300ms (easier if you control the BIOS and can tune it for speed, as coreboot can), faster if resuming from RAM. That should allow you to do load balancing while powering off the extra capacity (using wake-on-LAN).
If load becomes too high for the SBC or close to its capacity, wake the server and perform a handover once it's up. You can either hold the packets and use the SBC as a proxy, or change your router's config to point to the newly awakened server (alternatively, just swap IP or MAC addresses). With a bit of magic to avoid closing existing connections (I believe home ISP routers keep NAT connections open if a port forward is changed), it would work. Obviously it's even easier with a proper load balancer.
edit: actually even a router might be able to handle low loads
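For the curious, the "wake" half of this is genuinely simple. Here's a minimal sketch of the SBC-side logic, assuming WoL is enabled in the big server's firmware and both machines share a broadcast domain; the MAC, IP, and load threshold are made-up placeholders, and the actual handover is left as a comment:

```python
import os
import socket
import subprocess
import time

BIG_SERVER_MAC = "aa:bb:cc:dd:ee:ff"          # hypothetical
BIG_SERVER_IP = "192.168.1.10"                # hypothetical
LOAD_THRESHOLD = 0.8 * (os.cpu_count() or 1)  # "close to capacity", arbitrary margin

def send_magic_packet(mac: str, port: int = 9) -> None:
    """Wake-on-LAN magic packet: 6 bytes of 0xFF, then the target
    MAC repeated 16 times, sent over UDP broadcast."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    payload = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(payload, ("255.255.255.255", port))

def big_server_is_up() -> bool:
    # One ping, 1 second timeout (Linux iputils flags).
    return subprocess.run(
        ["ping", "-c", "1", "-W", "1", BIG_SERVER_IP],
        stdout=subprocess.DEVNULL,
    ).returncode == 0

while True:
    load1, _, _ = os.getloadavg()
    if load1 > LOAD_THRESHOLD and not big_server_is_up():
        send_magic_packet(BIG_SERVER_MAC)
        while not big_server_is_up():  # sub-second boot if tuned, per above
            time.sleep(0.2)
        # Handover goes here: proxy via the SBC, update the router's
        # port forward, or swap IP/MAC addresses as described above.
    time.sleep(5)
```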
So yeah, it's still wasted power and computation in my opinion. "Ready and waiting" should not take 100W per server, but be closer to 0.1W (WoL), or lower if managing state from a central node. I guess it's not worth optimizing for most people, and big cloud probably does something similar already.
In a way, it's a bit like big.LITTLE with additional latency: small and power-efficient vs. big, fast, and inefficient at small loads.
Modern CPUs drop into lower power states super quickly and draw almost nothing. The thing is, if the server is running many VMs, there's no way it's going to a low power state, even if some are doing nothing (others won't be). You also have 10 jet engines blowing air at the front, which probably draw more power than the CPU does when both are idle.
Totally agreed that 100W at idle is a waste, but I don't think that's what the parent was talking about. My read was that the parent comment was responding to capacity planning that tends towards reducing the number of servers (without building out low-latency cold-boot infra) in the name of cost/energy savings, resulting in higher request latency. Anecdotally this seems plausible, but I don't have metrics to back it up.
Still, if you replace 100 servers, with 100 owners, all waiting for connections with 1 server hosting 100 sites and 99 others in a low-power mode waiting for traffic, you save a lot of power and don't lose much.
Anyway, it would be waste even if you couldn't save any of it. "Waste" is simply the name for things we consume but don't actually use. All industries use that term.
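For what it's worth, the arithmetic on that consolidation is stark, taking the ~100W figure quoted at the top and the ~0.1W WoL standby figure from above (both rough):

```python
# Back-of-envelope: 100 always-on servers vs 1 active + 99 in WoL standby.
# Both wattages are the rough figures quoted in this thread, not measurements.
ACTIVE_W = 100.0   # one server idling, per the quote at the top
STANDBY_W = 0.1    # a NIC waiting for a magic packet

before = 100 * ACTIVE_W                  # 10000 W
after = 1 * ACTIVE_W + 99 * STANDBY_W    # 109.9 W
print(f"{before:.0f} W -> {after:.1f} W "
      f"({100 * (1 - after / before):.1f}% saved)")  # ~98.9% saved
```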