controlling the producer is such a hard problem. even with exponential backoff and backoff times in the response headers, you still get at least a 2x increase in throughput from the producers during a retry storm

the problem is that the most common backpressure techniques, like exponential backoff and sending a Retry-After time in the response header, have a limit on the maximum backoff time they can impose. in some scenarios that limit is much, much shorter than the normal gap between a client's requests.
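
a minimal sketch of that cap, in Python. the function name `retry_delay`, the 1 s base and the 30 s cap are all assumptions for illustration, not values from any particular client library, and jitter is left out to keep the numbers readable.

```python
from typing import Optional

def retry_delay(attempt: int,
                retry_after: Optional[float] = None,
                base: float = 1.0,
                cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical client policy: take the larger of the exponential
    backoff and the server's Retry-After hint, then clamp to `cap`.
    """
    backoff = base * (2 ** attempt)
    hinted = retry_after if retry_after is not None else 0.0
    return min(cap, max(backoff, hinted))

# Backoff doubles until it hits the cap, then stays flat.
delays = [retry_delay(a) for a in range(8)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0, 30.0]

# A server asking for 120 s is still clamped to the client's 30 s cap.
print(retry_delay(0, retry_after=120.0))  # 30.0
```

once a client reaches the cap, it keeps retrying every `cap` seconds no matter how long the outage lasts, so its request rate never drops below 1/`cap`.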

for example, imagine a customer who explores 10 items on Amazon and then finally places an order, so 10 rps on the product page for every 1 rps on the order page. if the order service goes down, customers slowly get stuck on the order page, and even with backpressure your RPS on the order page keeps growing. exponential backoff doesn't help either.
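
a rough simulation of that scenario, in Python. it assumes one new customer reaches the order page every second, every request fails, and each stuck customer retries with capped exponential backoff (1 s base, 30 s cap, no jitter). all of these numbers and the name `order_page_rps` are made up for illustration.

```python
from collections import Counter
from typing import List

def order_page_rps(duration_s: int,
                   arrivals_per_s: int = 1,
                   base: int = 1,
                   cap: int = 30) -> List[int]:
    """Requests per second hitting the order page while the service is down."""
    hits: Counter = Counter()
    for arrival in range(duration_s):
        t, attempt = arrival, 0
        # This cohort of customers keeps retrying until the simulation ends.
        while t < duration_s:
            hits[t] += arrivals_per_s
            t += min(cap, base * 2 ** attempt)  # capped exponential backoff
            attempt += 1
    return [hits[s] for s in range(duration_s)]

rps = order_page_rps(300)
print(rps[0], rps[60], rps[299])  # 1 6 14
```

under these assumptions the order page goes from 1 rps to 14 rps after five minutes of downtime. once customers hit the cap, every 30 s of outage adds roughly another 1 rps, because new customers keep arriving while the old ones never stop retrying.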

dropping requests is a good idea, but that behavior isn't always designed in by default. systems go into a metastable state, and you need the ability to control throughput on the producer side.

you could solve it by putting a separate layer in between, like a load balancer or a gateway, that is resilient to such throughput spikes. it lets you control the throughput reaching your service and slowly scale it up as you need (per user or at random).
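
a sketch of what that admission decision could look like at the gateway, in Python. `admit` and `admit_percent` are hypothetical names, and the hashing scheme is just one way to get a stable per-user decision, not how any specific load balancer does it.

```python
import hashlib
import random

def admit(user_id: str, admit_percent: int, by_user: bool = True) -> bool:
    """Decide at the gateway whether to forward a request or shed it.

    by_user=True:  hash the user id into a bucket 0..99, so the same users
                   stay admitted as the percentage is ramped up.
    by_user=False: pick a bucket at random for every request.
    """
    if by_user:
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
    else:
        bucket = random.randrange(100)
    return bucket < admit_percent

# Ramp the recovering service up in steps instead of all at once.
for percent in (1, 5, 25, 50, 100):
    admitted = sum(admit(f"user-{i}", percent) for i in range(1000))
    print(f"{percent}% -> {admitted} of 1000 users admitted")
```

admitting by user means a user who gets through keeps getting through on later requests, which is friendlier to multi-step flows like checkout. admitting at random is simpler but can let a user through on one request and drop them on the next.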

for frontend devices, it gets exponentially harder to control the throughput. having an independent config API that can control the throughput is the best solution I've come across.
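
a minimal client-side sketch of that idea, in Python for consistency with the other examples. the class name, the `min_interval_s` field and the 60 s refresh interval are assumptions; a real config API would define its own schema, and `fetch_config` stands in for whatever call reaches it.

```python
import time
from typing import Callable, Dict, Optional

class ConfigDrivenThrottle:
    """Client-side gate whose limit comes from a separate config endpoint."""

    def __init__(self,
                 fetch_config: Callable[[], Dict[str, float]],
                 clock: Callable[[], float] = time.monotonic,
                 refresh_s: float = 60.0) -> None:
        self._fetch_config = fetch_config  # hypothetical call to the config API
        self._clock = clock
        self._refresh_s = refresh_s
        self._config: Dict[str, float] = {"min_interval_s": 0.0}
        self._config_at: Optional[float] = None
        self._last_sent: Optional[float] = None

    def _refresh(self, now: float) -> None:
        if self._config_at is None or now - self._config_at >= self._refresh_s:
            try:
                self._config = self._fetch_config()
            except Exception:
                pass  # config API unreachable: keep the last known limits
            self._config_at = now

    def allow(self) -> bool:
        """Return True if the client may send a request to the service now."""
        now = self._clock()
        self._refresh(now)
        min_interval = self._config.get("min_interval_s", 0.0)
        if self._last_sent is not None and now - self._last_sent < min_interval:
            return False
        self._last_sent = now
        return True
```

the point of keeping the config API independent is that it stays reachable while the main service is overloaded, so operators can raise `min_interval_s` during an incident and lower it again as the service recovers.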