> QUERY requests are both safe and idempotent with regards to the resource identified by the request URI. That is, QUERY requests do not alter the state of the targeted resource. However, while processing a QUERY request, a server can be expected to allocate computing and memory resources or even create additional HTTP resources through which the response can be retrieved.
The possible creation of extra HTTP resources (response resources?) seems to me contrary to idempotency. That seems more like the territory of POST.
If two identical QUERY requests might produce different response resources, how do we square that with the fact that QUERY will be cacheable?
> The possible creation of extra HTTP resources (response resources?) seems to me contrary to idempotency.
If two repetitions of a QUERY request create the same extra HTTP resource(s), then it can be idempotent.
Idempotent means you can't tell the difference between 1 or N requests, not that you can't tell the difference between 0 and 1. Think about PUT, which is also idempotent.
GET is (and all methods, including all safe and idempotent methods, are) allowed to have side effects, per the spec. Safe and idempotent are not mathematical constructs as defined in HTTP, they are more “business” constructs.
I know what you mean. It feels like we're missing an idea of scope for resources. If there was some kind of transaction scope or session scope or something, then a QUERY could create resources within that scope, so we could know that in the long run, it has no side effects. But that would be antithetical to the idea of statelessness perhaps.
Or maybe we just need distributed garbage collection for URLs.
> GET is allowed to have side effects, just not beyond the first invocation of a given request.
GET can have side effects, and there is no difference between the first and subsequent invocations (because it is safe as well as idempotent). Were it idempotent but not safe, it could have side effects that the client was accountable for on the first request, but no further ones of that kind on subsequent requests.
The way I look at it is that the system must continue to meet its requirements (whatever they might be) whether it gets one GET request or many in response to a single action within the user agent (clicking a link, submitting a form, script making a request, etc.). In general, logging two requests instead of one does not violate any requirements and in fact logging every request, even duplicates, is the expected behavior. Adding the same item to a list twice in response to a single UI interaction, on the other hand, would not give the desired effect.
> The possible creation of extra HTTP resources (response resources?) seems to me contrary to idempotency.
A GET request might create additional (or modify existing) resources, say if the API exposed its own log via HTTP.
Both safe and idempotent are less expansive than one might naively think in the HTTP spec (which is good, because the naive understanding, while aesthetically seductive, isn't very practical at all).
Some quotes from the relevant bits of RFC 7231:
“This definition of safe methods does not prevent an implementation from including behavior that is potentially harmful, that is not entirely read-only, or that causes side effects while invoking a safe method. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it.”
“The purpose of distinguishing between safe and unsafe methods is to allow automated retrieval processes (spiders) and cache performance optimization (pre-fetching) to work without fear of causing harm. In addition, it allows a user agent to apply appropriate constraints on the automated use of unsafe methods when processing potentially untrusted content.”
“Like the definition of safe, the idempotent property only applies to what has been requested by the user; a server is free to log each request separately, retain a revision control history, or implement other non-idempotent side effects for each idempotent request.”
“Idempotent methods are distinguished because the request can be repeated automatically if a communication failure occurs before the client is able to read the server's response.”
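To make that last quote concrete, here is a minimal sketch of a client that retries automatically only when the method is idempotent. `FlakyTransport` and `send_with_retry` are hypothetical names invented for this illustration, not part of any real HTTP library, and including QUERY in the set assumes the draft quoted at the top of the thread.

```python
class FlakyTransport:
    """Hypothetical transport that loses the first `failures` responses."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def send(self, method, path):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("response lost before the client could read it")
        return 200


# Idempotent methods per RFC 7231; QUERY is added on the assumption
# that the draft's "safe and idempotent" wording holds.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE", "QUERY"}


def send_with_retry(transport, method, path, attempts=3):
    # Non-idempotent methods get exactly one try: the client cannot know
    # whether the lost request was already processed by the server.
    max_tries = attempts if method in IDEMPOTENT_METHODS else 1
    for attempt in range(1, max_tries + 1):
        try:
            return transport.send(method, path)
        except ConnectionError:
            if attempt == max_tries:
                raise
```

With two lost responses, a GET succeeds on the third try, while a POST surfaces the first `ConnectionError` to the caller instead of being repeated.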
> In HTTP idempotent is where the state of the server remains unchanged
No, it's not. That's closer to “safe” than “idempotent” (safe also implies idempotent, but not the other way around), but even then it is not quite right, because even safe methods are allowed to have side effects; there is only guidance about the kind and impact of side effects they shouldn't have.
So that means that, without a cache, repeating a QUERY might create two response resources but, with a cache, only one will be created. I find that odd. My understanding of HTTP idempotency is that it's more of a "whole-server" concept (excepting perhaps things like creation of log entries and metrics). Always creating a new resource for each request seems contrary to that.
A way to square creation of response resources with idempotency could be: the second identical QUERY that arrives should always reuse the result resource created by the first QUERY.
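One way that could look, as a rough sketch: derive the result resource's path from a hash of the target and the query body, so a repeated QUERY lands on the same resource instead of creating a second one. `QueryServer`, the `/results/` prefix, and the use of SHA-256 are all assumptions made for illustration; nothing in the draft prescribes this.

```python
import hashlib


class QueryServer:
    """Hypothetical server that stores one result resource per distinct query."""

    def __init__(self):
        self.results = {}  # result path -> query body that produced it

    def query(self, target, body):
        # Identical (target, body) pairs always hash to the same path.
        digest = hashlib.sha256(f"{target}\n{body}".encode()).hexdigest()[:16]
        location = f"/results/{digest}"
        # setdefault creates the resource on the first request only;
        # repeats find it already there and leave the store unchanged.
        self.results.setdefault(location, body)
        return location
```

Sending the same QUERY twice returns the same path and leaves one entry in `results`, so the server state after N requests matches the state after one.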
If I QUERY the current price of a stock, and then someone else sends an identical QUERY ten seconds later, they might get a different result. This is not because QUERY isn't idempotent.
I think that, when talking about idempotency, there's the implicit assumption that the "rest of the world" stays the same while the sequence of operations is performed.
RFC 2616 says:
> Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request.
Idempotency is not about "you get the same result", it's about the effects of your http request on the server. Notice that the definition you quoted is in terms of side-effects, not results.
If a request changes the state of the server and another identical request changes the state of the server in a different way, it's not idempotent.
If a request doesn't change the state of the server at all it is idempotent, even if subsequent requests might get different responses (e.g. the stock quote example in my previous post).
If a request changes the state of the server but repeated identical requests don't have any different effect it is also idempotent. For example, DELETE is idempotent because DELETE-ing something N times is the same as deleting it one time.
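The three cases above can be sketched with a toy in-memory server. Everything here (`Server`, `albums_after`, `is_idempotent`) is made up for illustration, and "state" is narrowed to the album collection to keep the example small.

```python
import itertools


class Server:
    """Toy server: albums are the state, price stands in for the outside world."""

    def __init__(self):
        self.albums = {1: "Blue"}
        self.price = 100
        self._ids = itertools.count(2)

    def get_price(self):
        # Changes nothing; the response may still differ as the price moves.
        return self.price

    def delete_album(self, album_id):
        # Deleting an already-deleted album leaves the state as it was.
        self.albums.pop(album_id, None)

    def post_album(self, title):
        # Each call adds a new entry, so repeats keep changing the state.
        self.albums[next(self._ids)] = title


def albums_after(n, request):
    """State of a fresh server after sending the same request n times."""
    server = Server()
    for _ in range(n):
        request(server)
    return server.albums


def is_idempotent(request):
    # Compare 1 request against N, not 0 against 1.
    return albums_after(1, request) == albums_after(3, request)
```

Under these assumptions, `is_idempotent` is true for `get_price` and `delete_album` and false for `post_album`, even though `delete_album` clearly changes the state on its first call.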
I think this is pointing to the problem with your definition of 'idempotent'. Idempotency simply means that any number of additional identical requests will have the same effect on the state of the resource, not that they will have no effect. (And by 'have the same effect', we mean 'produce the same state', not 'alter state in the same way' - effects are algebraic projections.)
That's why it's called idempotent - 'doing the same' - rather than impotent.
As I read it, I think the idea is to allow a pattern where the resulting resource refers to other resources that somehow encode the contents of the QUERY request body in their URL (or where the QUERY even results in a redirect to such a resource). For example, the result of a QUERY is a page with an HTML table of the data, which also includes a server-side rendered chart of the same data as an external image.
[Edit: returning a redirect to a URL that somehow encodes the query is even given as an example in section 4.2]
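As a rough illustration of "a URL that somehow encodes the query", a server could put the QUERY body into the query string of a plain GET-able URL and redirect there. The parameter name `q` and both helper functions are assumptions for this sketch; the draft's own example may encode things differently.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def redirect_location(target, query_body):
    """Build a URL for `target` that carries the query body (hypothetical scheme)."""
    return f"{target}?{urlencode({'q': query_body})}"


def query_from_location(location):
    """Recover the original query body from a URL built by redirect_location."""
    return parse_qs(urlsplit(location).query)["q"][0]
```

Because the URL is a pure function of the request, two identical QUERY requests are redirected to the same resource, which also fits with the response being cacheable.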
> idempotent with regards to the resource identified by the request URI
That means that a QUERY request can change the state of the server, for example by creating new resources; there's exactly one resource it's not allowed to change.
That has always been the case ... requests get logged, and if the server exposes its access logs over HTTP, that's one thing for which a request won't be idempotent.
Idempotent etc. in the HTTP specs has always been more or less an attempt at a promise to the client: "you should be able to repeat this request if you're not sure about success/failure without anyone claiming to implement HTTP being able to throw the book at you".
A resource is defined by a path, so if you have a `QUERY /documents` or `QUERY /albums` endpoint, the resource is all documents or albums that you are searching across, so it cannot add one of those items (like `POST /album`). It is possible that this could affect some other resource (e.g. an audit trail), which would mean that a `QUERY /logs/audit` endpoint must not add an audit log entry per the idempotent requirement.
Hum... You are complaining about a request having the side effect that a server may fork another process to answer it? There's not really much anybody can do about this.