> An operation is idempotent if performing it multiple times yields the same result as performing it exactly once.
And then he casually offers an API that does different things on first and second call as the "good" example. If you have a "create a virtual machine" API it better create a fucking virtual machine. If I call the damn thing twice, I expect to have two VMs. If there is some sort of unique argument, like "create a named VM", I would expect the API to throw an error if the name is already taken, not to just return like everything is normal.
And this guy is being all snarky about API design?
I agree with the author re: returning success on retries; it lets you automate the retry process.
I work in mobile games; because someone might play in a tunnel or bad network area, I need to make sure that every request is retry-able.
To do that I generally include a GUID of some kind in the request; if the client says "create an entry for XXYY," there's a chance that the request will get to the server but the response will fail to reach the client.
If the client is able to retry the request (with the same GUID) and get a success response, then I can have the retries handled transparently in the communication layer; all the client code needs to know is "I made this request and it was a success," without any knowledge of how many tries it took.
If the second/third/etc request returned an error of some kind, I wouldn't have a good "success" response to hand back to the game code. (I'm assuming the "success" response contains some information that the game code needs.)
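A rough sketch of what I mean, not real production code: `Server`, `create_entry`, `FlakyNetwork` and `send_with_retries` are all made-up names, and the "network" is just an in-process call with a flag that simulates a lost response. The point is only that the GUID is generated once and reused on every retry.

```python
import uuid


class FlakyNetwork(Exception):
    """Hypothetical error: the request arrived but the response was lost."""


class Server:
    def __init__(self):
        self.entries = {}                # request_id -> response already produced
        self.drop_next_response = False  # hook to simulate driving into a tunnel

    def create_entry(self, request_id, payload):
        # Do the work only the first time this request_id is seen.
        if request_id not in self.entries:
            self.entries[request_id] = {
                "entry_id": len(self.entries) + 1,
                "payload": payload,
            }
        response = self.entries[request_id]
        if self.drop_next_response:
            self.drop_next_response = False
            raise FlakyNetwork("response lost")
        return response


def send_with_retries(server, payload, max_tries=3):
    # The GUID is created once per logical request, not once per attempt.
    request_id = str(uuid.uuid4())
    for _ in range(max_tries):
        try:
            return server.create_entry(request_id, payload)
        except FlakyNetwork:
            continue
    raise FlakyNetwork(f"gave up after {max_tries} tries")
```

The game code only ever calls `send_with_retries` and gets back one success response, however many attempts it took.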
What happens if some other process has already created a VM with this name and spec? Under most realistic scenarios I would rather VM creation failed than silently clobber someone else's VM.
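To make the concern concrete, here is one way an API could tell "my own retry" apart from "someone else already has this name". This is only a sketch under my own assumptions: `VmRegistry`, `create_named_vm`, `request_token` and `NameTaken` are invented for illustration and are not from the article.

```python
class NameTaken(Exception):
    """Hypothetical error: the name belongs to a VM created by another caller."""


class VmRegistry:
    def __init__(self):
        self.vms = {}  # name -> VM record

    def create_named_vm(self, name, spec, request_token):
        existing = self.vms.get(name)
        if existing is None:
            # First time this name is seen: actually create the VM.
            self.vms[name] = {"name": name, "spec": spec, "token": request_token}
            return self.vms[name]
        if existing["token"] == request_token and existing["spec"] == spec:
            # Same caller repeating the same request: hand back the same VM.
            return existing
        # Anyone else asking for this name gets an error, nothing is clobbered.
        raise NameTaken(name)
```

With something like this, a retry by the original caller succeeds, while a second process asking for the same name fails loudly.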
This nonidempotent API is harder to use. Someone that doesn't know about these error codes or the fact that the API isn't idempotent will write code without the try-catch blocks that doesn't handle retries correctly. With the idempotent API, users fall into the pit of success where things just work without them having to know the details about each of the edge cases.
The nonidempotent API is exposing some extra data to the user, but it's not super useful. You basically always want to treat the vm_already_exists error identically to a success response. Maybe you also want to log some data about how many retries were necessary so you can figure out how spotty the network connection is, but there's no reason that couldn't work with the idempotent API either. The idempotent API could include a header about whether the action was already taken previously.
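Roughly what the two shapes look like side by side. This is a toy in-memory sketch; `Api`, `make_vm_strict`, `make_vm_idempotent` and the `already_existed` flag are names I made up, with the flag standing in for the header idea.

```python
class VMAlreadyExists(Exception):
    """Hypothetical error raised by the nonidempotent variant."""


class Api:
    def __init__(self):
        self.vms = {}  # vm_id -> VM record

    def get_vm(self, vm_id):
        return self.vms[vm_id]

    def make_vm_strict(self, vm_id):
        # Nonidempotent: a repeat call is reported as an error.
        if vm_id in self.vms:
            raise VMAlreadyExists(vm_id)
        self.vms[vm_id] = {"id": vm_id}
        return self.vms[vm_id]

    def make_vm_idempotent(self, vm_id):
        # Idempotent: a repeat call succeeds and says so in the result.
        already_existed = vm_id in self.vms
        if not already_existed:
            self.vms[vm_id] = {"id": vm_id}
        return self.vms[vm_id], already_existed


def make_vm_via_strict_api(api, vm_id):
    # The wrapper every caller of the strict API has to remember to write.
    try:
        return api.make_vm_strict(vm_id)
    except VMAlreadyExists:
        return api.get_vm(vm_id)
```

The retry count is still available to whoever wants it via `already_existed`; everyone else can ignore it.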
Consider how TCP connections are used by applications. Your application doesn't have to opt in to handling packets that were resent. The fact that some packets had to be resent is by default just an implementation detail. You have to opt in to get information about the resent packets; by default they're handled like regular successful packets. Idempotent APIs are about making handling retries work by default in a very similar way.
Let's start simple: your example assumes that you generate the id yourself. In my experience a common API usage pattern would look more like
try:
    vm_id = api.make_vm()
except SomeError as e:
    log.error(e)
else:
    res = api.do_thing_with_vm(vm_id)
and in your example, if we are generating ids ourselves, we still have to verify that we got the right VM. If your ids are provably unique, there is no reason to generate them yourself; the API can take care of that. But if you want something like a named entity, you have a problem: what if the name is already taken? So your code would look more like
new_id = generate_id()
try:
    vm = api.get_vm(new_id)
except VM_DoesNotExist:
    vm = api.make_vm(new_id)
except SomeError as e:
    log.error(e)
else:
    api.do_thing_with_vm(new_id)
because if the make_vm API simply returns a VM whether it was created or not, it is entirely possible that you are getting a VM that is busy doing something else for some other process.