If you actually intend to serve traffic back to the original requestor, you can't handle too many requests per server, because you have a limited amount of bandwidth. On a 1 Gbps pipe (which is what you get on, for instance, a single ELB instance), 150 simultaneous requests per server leaves each request only about 813 KB/s. At 10,000 requests you'd be down to about 12 KB/s per request. For comparison, the average web page size is now up to 2048 KB!
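For anyone who wants to check the arithmetic, here is a rough sketch of it. It assumes the link is split evenly across all in-flight requests and that "KB" means 1024 bytes (which is how the 813 figure comes out); the function name and constants are mine, not from any load balancer's tooling.

```python
def per_request_kib_per_s(link_bits_per_s: float, concurrent_requests: int) -> float:
    """Even share of the link for one request, in KiB/s (1 KiB = 1024 bytes)."""
    bytes_per_s = link_bits_per_s / 8  # bits -> bytes
    return bytes_per_s / concurrent_requests / 1024


LINK_BPS = 1_000_000_000  # assumed 1 Gbps pipe
PAGE_KIB = 2048           # the average page size quoted above

for n in (150, 10_000):
    share = per_request_kib_per_s(LINK_BPS, n)
    # Time to push one average-sized page through that share of the link.
    seconds_per_page = PAGE_KIB / share
    print(f"{n} requests: {share:.1f} KiB/s each, {seconds_per_page:.1f} s per page")
```

Under those assumptions an average page takes roughly 2.5 seconds at 150 requests and close to three minutes at 10,000, before any protocol overhead.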
To be clear, you can do a lot better with a better pipe, smart caching, compression, etc. But people often have horribly unrealistic estimates about how much traffic their servers can handle because they don't take bandwidth into account, and load balancers are no exception.
Of course, when you break it down by individual web request, most responses are still below 800 KB, but you shouldn't plan capacity around the average case. And clearly even the average case is well above 12 KB, especially for a CDN (which is responsible for serving the image, video, and large script content). I'm also pretty confident the page I linked already includes compression (which decreases size, but can increase response time quite a bit; many people expect software load balancers to be using the absolute fastest compression available, but that's often not the case in my experience).
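If you want to see the size-versus-time trade-off for yourself, something like the following works as a quick experiment. It uses Python's standard zlib at a few compression levels on a made-up, highly repetitive HTML-like payload; it is only an illustration, not what any particular load balancer does, and the numbers will vary by machine and by content.

```python
import time
import zlib


def measure(data: bytes, level: int) -> tuple[int, float]:
    """Compress data at the given zlib level; return (compressed size, seconds taken)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


# Hypothetical payload: repetitive markup, which compresses far better than real mixed content.
payload = b"".join(b"<li class='item'>row %d</li>\n" % i for i in range(50_000))

for level in (1, 6, 9):  # fastest, zlib's usual default, smallest output
    size, seconds = measure(payload, level)
    print(f"level {level}: {len(payload)} -> {size} bytes in {seconds * 1000:.2f} ms")
```

Higher levels generally buy a smaller response at the cost of more CPU time per request, which is the part people tend to forget when they assume compression is free.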
Sure, if you actually set the headers correctly and the browser actually retains them (recall that browsers are free to discard their caches at any time). And, if they can't be cached on the user's computer, who caches them? That's right, the CDNs. Like fast.ly. Who wrote the article.