If you go from 0 to 100K legit requests in the same instant, any sane architectu...

ndriscoll · on June 22, 2024

A sane architecture would be running your application on something that has at least the resources of a phone (e.g. 8+ GB RAM), in which case it should just buffer 100k connections without issue if that's what you need/want it to do. A sane application framework has connection pooling to the database built in, so the 100k requests would share ~16-32 connections and your developers never have to think about such things.
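To illustrate the pooling point, here is a minimal sketch in which a large burst of buffered requests shares a small, fixed number of database connections. It assumes an asyncio-style server; `handle_request`, `serve_burst`, and the `stats` dictionary are hypothetical names, and `asyncio.sleep(0)` is only a stand-in for a real query through a real pool.

```python
import asyncio


async def handle_request(pool: asyncio.Semaphore, stats: dict) -> None:
    # Each request waits (cheaply, as a suspended coroutine) for a free slot.
    async with pool:
        stats["in_use"] += 1
        stats["peak"] = max(stats["peak"], stats["in_use"])
        await asyncio.sleep(0)  # stand-in for a database query (assumption)
        stats["in_use"] -= 1
        stats["served"] += 1


async def serve_burst(n_requests: int, pool_size: int) -> dict:
    # The semaphore models a pool of `pool_size` database connections.
    pool = asyncio.Semaphore(pool_size)
    stats = {"in_use": 0, "peak": 0, "served": 0}
    await asyncio.gather(*(handle_request(pool, stats) for _ in range(n_requests)))
    return stats


if __name__ == "__main__":
    # 100k buffered requests sharing 16 connections, as in the comment above.
    print(asyncio.run(serve_burst(100_000, 16)))
```

The waiting requests cost only a coroutine object each, while the number of connections in use never exceeds the pool size.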

nijave · on June 26, 2024

You need to multiply that out by request servicing time.

Say your application uses 25ms of real CPU time per request. That's 40 reqs/sec per CPU core. On a 4-core server, that's 160 reqs/sec. That's 625 seconds to clear that backlog, assuming a linear rate (it's probably sublinear unless you have good load shedding).

So that's about 10 minutes to service 100k requests in your example. I'm ignoring any persistent storage (DB) since that would exist with or without Lambda, so that would need its own design/architecture.
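The arithmetic above can be checked with a small sketch. The function name `backlog_drain_seconds` is hypothetical, and the model assumes purely CPU-bound requests served at a constant rate with no queueing overhead.

```python
def backlog_drain_seconds(requests: int, cpu_ms_per_request: float, cores: int) -> float:
    """Time to clear a backlog if every core is fully busy on CPU-bound requests."""
    per_core_rate = 1000 / cpu_ms_per_request  # requests per second per core
    total_rate = per_core_rate * cores         # requests per second per server
    return requests / total_rate


if __name__ == "__main__":
    seconds = backlog_drain_seconds(100_000, cpu_ms_per_request=25, cores=4)
    print(f"{seconds:.0f} s, or about {seconds / 60:.1f} minutes")
```

With the numbers from the comment (100k requests, 25ms each, 4 cores) this gives 625 seconds, a little over 10 minutes.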

chuckadams · on June 22, 2024

I'd call pooling part of "the DB". "DB Layer" if you must, or the interface to it, whatever. Anyway, AWS has RDS Proxy, which held up pretty well against my (ad hoc and unscientific) load tests. But if you're actually trying to handle 100K DB requests in flight at once, your DB layer probably has some distributed architecture going on already.

ndriscoll · on June 22, 2024

If you're using RDS Proxy, now you're not scaling to zero, and you still can't handle 100k burst requests because Lambda can't do that. So why not use a normal application architecture which actually can handle bursts no problem and doesn't need a distributed database?

Lambda could be a compelling offering for many use cases if they made it so that you could set concurrency on each invocation, e.g. only spin up one invocation if you have fewer than 1k requests in flight, and let that invocation process them concurrently. But as long as it can only do 1 request at a time per invocation, it's just a vastly worse version of spinning up 1 process per request, which we moved away from because of how poorly it scales.

nijave · on June 26, 2024

Lambda can scale to 10k in 1 minute https://aws.amazon.com/blogs/aws/aws-lambda-functions-now-sc...

If your response time is 100ms, that's 100k requests in 1 minute.

Lambda runs your code in a VM that's kept hot so repeated invocations aren't launching processes. AWS is eating the cost of keeping the infra idle for you (arguably passing it on).

ndriscoll · on June 26, 2024

A normal application can scale to 10k concurrent requests as fast as they come in (i.e. a fraction of a second). Even at 16kB/request, that's a little over 160 MB of memory. That's the point: a socket and a couple objects per request scales way better than 1 process/request which scales way better than 1 VM per request, regardless of how hot you keep the VM.
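As a quick check of the memory figure, here is a minimal sketch. It assumes "16kB" means 16 × 1024 bytes and reports decimal megabytes; `burst_memory_mb` is a hypothetical helper name.

```python
def burst_memory_mb(concurrent_requests: int, kib_per_request: int) -> float:
    """Total per-request state in decimal megabytes (1 MB = 1,000,000 bytes)."""
    total_bytes = concurrent_requests * kib_per_request * 1024
    return total_bytes / 1_000_000


if __name__ == "__main__":
    print(f"{burst_memory_mb(10_000, 16):.2f} MB")
```

Under those assumptions, 10k concurrent requests at 16kB each come to 163.84 MB, which matches "a little over 160 MB".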

Serving 10k concurrent connections/requests was an interesting problem in 1999. People were doing 10M on one server 10 years ago[0]. Lambda is traveling back in time 30 years.

[0] https://migratorydata.com/blog/migratorydata-solved-the-c10m...