Today I learned the modern browser has even more caches than I thought it did.
Actually, a good web-engineer interview question might be how many caches are involved in a web request. A good interview could easily end up with 20 on the whiteboard.
Just a guess on my part here; anything in parentheses may or may not exist. This is not a perfectly exact representation of the system either: in reality many of these steps are invoked recursively, and at most steps the CPU and disk cache layers come into play again. I've certainly missed many.
(Application DNS cache) -> (libc DNS cache) -> (gateway DNS cache) -> (DNS caching resolver cache) -> (DNS nameserver authoritative cache) -> filesystem cache -> hard disk device cache -> (JavaScript JIT cache) -> HTTP cache -> HTTP push cache -> HTTP preload cache -> CPU L1 -> CPU L2 -> (CPU L3) -> (HTTP proxy cache) -> (N hops invoking many of these steps N times) -> (HTTP router/relay cache) -> (HTTP reverse proxy cache) -> HTTP server request caching.
Because on some hops almost all of these are invoked again for a request, we aren't looking at 20 caches; we are looking at hundreds.
The one thing that keeps on bugging me about HTTP/2 Push is that it ignores/sidesteps all the usual cache logic.
If you are on mobile with limited (or expensive) gigabytes, as is often the case, push can cause heavy resources to be sent to you even when you already have them in the local cache.
The browser can cancel the stream, but only once the data is already arriving, by which point some of the bandwidth has been spent.
I am aware that there are cookie-based hacks, but... I really don't understand why this didn't come up as a possible problem earlier, during SPDY testing etc.
Cache digests are a proper solution that will soon be standardized; they're not a cookie-based hack. If you need one today, you can implement it yourself: just take any existing Bloom filter implementation.
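For illustration, here's a rough TypeScript sketch of the idea (the sizes, hash function, and URLs are made up, and the actual cache-digest draft uses a more compact wire encoding, but the mechanism is the same):

    // Minimal Bloom filter for a client-side cache digest (sketch only).
    class BloomFilter {
      private bits: Uint8Array;

      constructor(private size: number, private hashes: number) {
        this.bits = new Uint8Array(Math.ceil(size / 8));
      }

      // FNV-1a, salted with a seed; good enough for a sketch.
      private hash(url: string, seed: number): number {
        let h = 2166136261 ^ seed;
        for (let i = 0; i < url.length; i++) {
          h ^= url.charCodeAt(i);
          h = Math.imul(h, 16777619);
        }
        return (h >>> 0) % this.size;
      }

      add(url: string): void {
        for (let i = 0; i < this.hashes; i++) {
          const bit = this.hash(url, i);
          this.bits[bit >> 3] |= 1 << (bit & 7);
        }
      }

      // False positives are possible, false negatives are not -- which is
      // the right trade-off here: worst case the server skips a push.
      mightContain(url: string): boolean {
        for (let i = 0; i < this.hashes; i++) {
          const bit = this.hash(url, i);
          if ((this.bits[bit >> 3] & (1 << (bit & 7))) === 0) return false;
        }
        return true;
      }
    }

    // Client: build a digest of cached URLs and ship it with the request
    // (e.g. base64-encoded in a header); server: skip pushing anything
    // the digest says the client probably already has.
    const digest = new BloomFilter(1024, 3);
    digest.add("https://example.com/critical.css");
    digest.mightContain("https://example.com/critical.css"); // true
    digest.mightContain("https://example.com/app.js");       // false (probably)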
You can just not push optional resources. Or push the smallest responsive asset and let the full-scale one come in if requested. Just because it doesn't handle every edge case doesn't make it a failure. Being able to push required dependencies without a round trip is a huge boon for those same mobile users that might have restricted bandwidth.
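To make that concrete, here's a rough sketch using Node's built-in http2 module (the cert paths and asset names are invented): push the one dependency the page can't render without, and let everything optional be fetched, and cached, the normal way.

    import { createSecureServer } from "http2";
    import { readFileSync } from "fs";

    const server = createSecureServer({
      key: readFileSync("server-key.pem"),   // made-up paths
      cert: readFileSync("server-cert.pem"),
    });

    server.on("stream", (stream, headers) => {
      if (headers[":path"] !== "/") {
        stream.respond({ ":status": 404 });
        stream.end();
        return;
      }
      // Push only the required dependency: the critical CSS.
      stream.pushStream({ ":path": "/critical.css" }, (err, pushStream) => {
        if (err) return; // e.g. the client disabled push via SETTINGS
        pushStream.respond({ ":status": 200, "content-type": "text/css" });
        pushStream.end("body { margin: 0 }");
      });
      // Optional assets (full-scale images, fonts) are referenced but not
      // pushed, so the browser's HTTP cache gets a chance to answer first.
      stream.respond({ ":status": 200, "content-type": "text/html" });
      stream.end('<link rel="stylesheet" href="/critical.css"><img src="/hero-small.jpg">');
    });

    server.listen(8443);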
If the mobile user already has those resources cached, it ends up using more bandwidth. It's possible to implement checks for this, but it can be tricky to get right in all cases.
On the other hand, the browser's HTTP cache is well understood and standardised, and just requires sending the correct response headers (at the cost of the additional round trip).
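For example, a minimal Node sketch of that trade (the asset, etag value, and max-age are made up): the first request costs a round trip plus the body, and every later one is either a bodyless 304 or never leaves the browser at all.

    import { createServer } from "http";

    const server = createServer((req, res) => {
      if (req.url !== "/critical.css") {
        res.writeHead(404);
        res.end();
        return;
      }
      const etag = '"v42"'; // hypothetical version tag for this asset
      if (req.headers["if-none-match"] === etag) {
        res.writeHead(304); // revalidation: a round trip, but no body
        res.end();
        return;
      }
      res.writeHead(200, {
        "content-type": "text/css",
        // With immutable, well-behaved browsers won't even revalidate
        // until the entry expires.
        "cache-control": "max-age=31536000, immutable",
        etag,
      });
      res.end("body { margin: 0 }");
    });

    server.listen(8080);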
The spec gives the browser the ability to cancel a push when it already has the resource cached. Testing that was one of the pieces of the linked article, because not all browsers implement it yet, and their behaviors differ somewhat.
Unless the user is using Safari, in which case it'll likely waste bandwidth.
Also, I don't think servers give us the right priority controls to replace inlining. As things currently stand, if the server spends bandwidth sending the body of the page ahead of the pushed critical CSS, you'll be slower than inlining.
Good job the post wasn't titled "The browser bugs and edge cases of HTTP/2 push are totally the problem browsers have today. Not the damn CPU usage, not the enormous RAM consumption"