A common solution to this problem is a two-stage process. Step 1 is a request asking "should I download?", with two possible replies: "no, check again in N time" and "yes, here is a token". Step 2 is presenting the token to the API endpoint for download and getting the file.
On the server side, you don't even need specific instance tracking, just a simple decision based on current resource usage, and a list of valid tokens (optionally, they can expire in some short time to avoid other thundering herd type issues). Say, you set a max number of file transfers, or bandwidth or whatever metric makes sense to you, and you simply reply based on that metric. Further, you can smooth out your load with a bit of intelligence on setting N.
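To make the server side concrete, here is a minimal sketch of the decision described above. It assumes the metric is a cap on concurrent transfers; the class name `DownloadGate`, its parameters, and the way N grows with each refusal are all hypothetical choices for illustration, not part of any existing server or library.

```python
import secrets
import time


class DownloadGate:
    """Answers "should I download?" with a token or a retry delay N."""

    def __init__(self, max_transfers=10, token_ttl=30.0, base_retry=60):
        self.max_transfers = max_transfers  # assumed metric: concurrent transfers
        self.token_ttl = token_ttl          # seconds before an unused token expires
        self.base_retry = base_retry        # smallest N handed to a refused client
        self.tokens = {}                    # issued, unredeemed token -> expiry time
        self.active = 0                     # transfers currently in progress
        self.refused = 0                    # refusals since the last grant

    def _expire(self, now):
        # Drop tokens that were never presented in time.
        self.tokens = {t: exp for t, exp in self.tokens.items() if exp > now}

    def should_download(self, now=None):
        """Step 1: return ("yes", token) or ("no", retry_after_seconds)."""
        now = time.time() if now is None else now
        self._expire(now)
        if len(self.tokens) + self.active < self.max_transfers:
            token = secrets.token_urlsafe(16)
            self.tokens[token] = now + self.token_ttl
            self.refused = 0
            return ("yes", token)
        # Each refusal gets a slightly larger N so retries do not arrive together.
        self.refused += 1
        return ("no", self.base_retry + self.refused)

    def redeem(self, token, now=None):
        """Step 2: accept a token once, and only before it expires."""
        now = time.time() if now is None else now
        self._expire(now)
        if self.tokens.pop(token, None) is None:
            return False
        self.active += 1
        return True

    def finish(self):
        """Call when a transfer ends so its slot can be handed out again."""
        self.active = max(0, self.active - 1)
```

Note there is no per-client state here: the server only keeps the list of valid tokens and a counter, which is the point made above about not needing instance tracking.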
Even better, you get a cool side-effect: since the check isn't so resource intensive, you can set the time between checks lower, and make the updates less regular.
Now that I think of it: it seems that this would be a nice nginx plugin, with a simple client side library to handle it for reference. Anyone want to collaborate on this over the weekend? Should be relatively straight-forward.
> A common solution to this problem, is to make a 2 stage process, where step 1 is a request of "should I download?", where there are 2 possible replies: "no, check again in N time" and "yes, here is a token". Step 2 is then presenting the token to the api point for download, and getting the file.
You don't even need two steps, just have one step with previously known data. That's how HTTP conditional requests (Last-Modified/If-Modified-Since and ETag/If-None-Match) work: the client states "I want this file, I already have one from such moment with such metadata", and the server replies either "you're good" (304) or "here's your file" (200).
Issue is, that only works when the file changes rarely enough, or you need additional server logic to reply that the file is still good when it's not.
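For reference, the server-side decision for a conditional request looks roughly like the sketch below. The function name `conditional_response` and the lowercase-keyed header dict are assumptions for illustration; ETag matching is simplified to exact string comparison (no weak `W/` handling).

```python
from email.utils import formatdate, parsedate_to_datetime


def conditional_response(request_headers, etag, mtime):
    """Return (status, headers) for a GET of a file with this ETag and mtime.

    mtime is a Unix timestamp; request header names are expected in lowercase.
    """
    validators = {"ETag": etag, "Last-Modified": formatdate(mtime, usegmt=True)}

    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # When present, If-None-Match takes precedence over If-Modified-Since.
        candidates = [c.strip() for c in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, validators
        return 200, validators

    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable date: treat the header as absent
        # HTTP dates have one-second resolution, so compare whole seconds.
        if since is not None and int(mtime) <= since:
            return 304, validators

    return 200, validators
```

A client that sends back the `Last-Modified` or `ETag` value it received earlier gets a 304 until the file changes, and a 200 with the new file afterwards.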
> Now that I think of it: it seems that this would be a nice nginx plugin, with a simple client side library to handle it for reference. Anyone want to collaborate on this over the weekend?
I'd be very surprised if nginx didn't support conditional requests already.
edit: according to [0] and [1] (which may be outdated), Nginx provides built-in support for Last-Modified on static files but does not provide ETag support (the developer believes this is not useful for static files, which is usually correct[2]); [1] has apparently written a module to do so [3]. The module being 4 years old, it might be way out of date.
[2] There are two situations in which it is not (keep in mind that this is for static content, dynamic is very different): if somebody willfully touches a file, it will change its Last-Modified but not its checksum, triggering a new send without ETag but not with it; and ETags can be coherent across servers (even in CDNs), while the chances of Last-Modified being exactly the same on all your servers are far smaller.
On the other hand, no etag is better than a shitty etag, and both Apache and IIS generate dreadful etags — which may hinder more than help — by default.
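To illustrate the "touched file" case from [2], here is a small sketch assuming a content-derived ETag (SHA-256 of the bytes). The helper names `content_etag` and `demo` are made up for this example; this is not how any particular server computes its ETags.

```python
import hashlib
import os
import tempfile


def content_etag(path):
    """ETag derived only from the file's bytes: same content, same ETag."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return '"' + digest.hexdigest() + '"'


def demo():
    """Touch a file and return its (mtime, etag) before and after."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "latest.bin")
        with open(path, "wb") as f:
            f.write(b"release 1")
        before = (os.path.getmtime(path), content_etag(path))
        # Simulate `touch`: move mtime forward an hour, leave the bytes alone.
        os.utime(path, (before[0] + 3600, before[0] + 3600))
        after = (os.path.getmtime(path), content_etag(path))
        return before, after
```

After the touch, the mtime (and so Last-Modified) differs while the ETag is unchanged, so an If-None-Match check would still answer 304 where an If-Modified-Since check would resend the file.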
Yes, this works for cache updating, and it is fantastic for that purpose. It does not solve the actual stated problem, which is that periodic checks in an attempt to smooth server loading away from peaks don't usually drift towards extremely bursty behavior. When the file does change, you still get a large number of clients trying to download the new content all at once. The solution I was suggesting is similar to what you are talking about, but also has the feature of smoothing the load curves.
> Issue is, that only works when the file changes rarely enough, or you need additional server logic to reply that the file is still good when it's not.
My algorithm is that logic -- albeit implemented with client side collusion rather than pure server side trickery (this allows better control should the client ignore the etags).
> The solution I was suggesting is similar to what you are talking about, but also has the feature of smoothing the load curves.
It has no more feature of smoothing the load curve than using Cache-Control with the right max-age.
> My algorithm is that logic
It is no more that logic than doing what I outlined with proprietary behaviors.
> this allows better control should the client ignore the etags
by making the whole client use a custom communication channel? I'd expect ensuring the client correctly speaks HTTP would be easier than implementing a custom client from scratch.
You still seem to be missing the point. Cache-Control as implemented commonly, and by your description, will instantly serve every request the new file as soon as a new file is available. It takes into account exactly one variable: file age.
The algorithm I describe takes into account variables which affect current system loading, and returns a "no, try again later", even when the file is actually different, because the server is trying to conserve some resource (usually in such cases it is bandwidth). Like I said, this can be done with etags, but a more explicit form of control is nicer. Which brings us to this:
> this allows better control should the client ignore the etags
> by making the whole client use a custom communication channel? I'd expect ensuring the client correctly speaks HTTP would be easier than implementing a custom client from scratch.
A client speaking proper http would be perfect for this. So point your http client to:
domain.com/getlatest
if there is a token available, respond with a:
307 domain.com/reallatest?token=foo
If no token is available and no if-modified headers are sent, reply with:
503 + Retry-After N
If no token is available, and the requestor supplied appropriate if-modified headers, respond with a:
304 + cache control for some scheduled time in the future (which the client can ignore or not)
Of course that last condition is strictly optional and not really required, since it would be abusing cache control rather than using 503 as intended.
(also note, a request to domain.com/reallatest with an invalid token or no token could result in a 302 to /getlatest or a 403, or some other form of denial, depending on the specifics of the application).
edit: Strictly speaking, the multiple-URL scheme above isn't even needed; a smart responder behind the 503 is enough. The redirect is there because there may be a larger application context around the system, in which getlatest does more than just serve the file, or in which multiple URLs would redirect to reallatest. Both are easily imaginable situations.
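Put together, the responses above could be sketched as two small functions. The names `respond_getlatest` and `respond_reallatest`, the `max-age` form of the cache-control reply, and the choice of a 302 (rather than a 403) for bad tokens are assumptions picked from the options listed above.

```python
def respond_getlatest(has_conditional_headers, token=None, retry_after=60):
    """Return (status, headers) for a request to /getlatest.

    token is a freshly issued token, or None when the server is at capacity.
    """
    if token is not None:
        # Capacity available: send the client on to the real download.
        return 307, {"Location": "/reallatest?token=" + token}
    if has_conditional_headers:
        # Optional variant: tell the client its copy is fine for a while.
        return 304, {"Cache-Control": "max-age=%d" % retry_after}
    # No capacity: ask the client to come back in N seconds.
    return 503, {"Retry-After": str(retry_after)}


def respond_reallatest(token, valid_tokens):
    """Return (status, headers) for /reallatest; valid_tokens is a set."""
    if token in valid_tokens:
        valid_tokens.discard(token)  # each token is good for one download
        return 200, {}
    # Invalid or missing token: send the client back to the gatekeeper.
    return 302, {"Location": "/getlatest"}
```

Any HTTP client that follows redirects and honors Retry-After can use this without a custom protocol.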
> If no token is available and no if-modified headers are sent, reply with:
> 503 + Retry-After N
That's cool. There's still no reason for the second url and the 307, and you're still getting hit with requests so you're not avoiding the request load, only the download. You're smoothing out bandwidth, but not CPU & sockets.
This is sort of true. I don't know of a way to simply limit the number of incoming sockets without getting a lot of ISP-level involvement or just outright rejecting connections. It does limit the number of long-lived sockets for file transfer. On static file serves, I am assuming the CPU has plenty of spare capacity for running the algorithm, so I am not worried about that. Finally, I am assuming the limiting factor here is bandwidth, so bandwidth smoothing is the main goal.