The distinction that it has to look like a normal function call is unhelpful. To me RPC and IPC are synonymous (which I get isn’t the Wikipedia definition). I lump them all together as “distributed communication”. The actual problems I think about are:

* semantics of what the request / response looks like and how failures are communicated

* how are the APIs composed? Not the local function call, but the functionality calling the API implies (see cap’n’proto’s page on composable remote APIs - it’s really brilliant).

* can you communicate across programming languages?

* how is ownership of remote resources managed?

* how is routing / dispatch of API calls managed?

For example, “fetch” in JS land looks a hell of a lot like a normal function call to me. And I certainly then wrapped it with easier-to-read APIs that looked more normal and handled routing and error handling for me. The main difference with HTTP is that RPC systems traditionally autogenerate a bunch of code for you. But in HTTP land you now have efforts like OpenAPI to do similar things. So HTTP isn’t RPC, but autogenerated code from a description of the API makes it RPC? That gets you into the ship of Theseus paradox, which isn’t helpful. At what amount of wrapping in local function calls do distributed API calls transform into RPC?
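To make the wrapping point concrete, here is a minimal sketch in Python rather than JS. Everything in it is made up for illustration: the `Transport` type, `make_get_user`, `RemoteError`, the URL layout and `fake_transport` are all assumptions, and the transport is injected so the sketch runs without a network. The point is only that a thin wrapper already hides routing and error handling behind something that reads like an ordinary call.

```python
import asyncio
import json
from typing import Awaitable, Callable, Optional, Tuple

# Hypothetical transport: takes (method, url, body) and resolves to
# (status, body_text). A real client would call an HTTP library here.
Transport = Callable[[str, str, Optional[str]], Awaitable[Tuple[int, str]]]


class RemoteError(Exception):
    """Raised when the remote side answers with a non-2xx status."""

    def __init__(self, status: int, body: str):
        super().__init__(f"remote call failed with status {status}")
        self.status = status
        self.body = body


def make_get_user(transport: Transport, base_url: str):
    """Wrap the raw transport so callers see an ordinary-looking function."""

    async def get_user(user_id: int) -> dict:
        # Routing: the wrapper, not the caller, knows the URL layout.
        status, body = await transport("GET", f"{base_url}/users/{user_id}", None)
        # Error handling: turn a transport-level status into a local exception.
        if not 200 <= status < 300:
            raise RemoteError(status, body)
        return json.loads(body)

    return get_user


async def fake_transport(method: str, url: str, body: Optional[str]) -> Tuple[int, str]:
    """In-memory stand-in for the network, used only for this sketch."""
    if method == "GET" and url.endswith("/users/7"):
        return 200, json.dumps({"id": 7, "name": "Ada"})
    return 404, "not found"


get_user = make_get_user(fake_transport, "https://example.invalid")
```

At the call site this is just `await get_user(7)`. Whether you hand-write that wrapper or generate it from an API description, the caller can’t tell the difference.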

To me it turns out that async / sync is a key distinction and the original sin of RPC in the 80s. It’s also why I view the coloring problem as a good thing (at least given where the tech stack is today). Making functions that may be doing I/O stand out and require thought is a good thing. The problem with many RPC/IPC mechanisms from the 80s is that they tried to make the code seem synchronous, not that it looks like a function call. But I haven’t seen any RPC systems where async vs sync is the distinguishing characteristic (e.g. cap’n’proto and fetch are both async).
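A small sketch of what I mean by coloring being useful, again in Python and with a made-up `remote_add` that only pretends to do a network round trip via `asyncio.sleep`. The `await` marks exactly where I/O can happen, and because the call is visibly async the caller can choose to overlap the two requests instead of silently blocking on each.

```python
import asyncio


async def remote_add(a: int, b: int) -> int:
    """Hypothetical remote call; the sleep stands in for a network round trip."""
    await asyncio.sleep(0.01)
    return a + b


async def total() -> int:
    # Both calls are visibly I/O, so the caller can run them concurrently
    # rather than paying for two sequential round trips.
    x, y = await asyncio.gather(remote_add(1, 2), remote_add(3, 4))
    return x + y
```

A synchronous-looking `remote_add(1, 2)` would read the same as local arithmetic and hide both the latency and the failure modes.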

To me, HTTP, REST, SOAP, COM, gRPC, cap’n’proto, JSON-RPC, XPC, Mach etc all try to solve the same problem of distributed communication and processing, just by making different trade offs.

HTTP tried to standardize RPC semantics, but you can clearly see it’s optimized particularly around static document retrieval and typical browser tasks. I’ll buy that static document retrieval need not be classified as a remote API, but there’s a lot of non-static document retrieval layered on the same semantics that is proper API calls. REST notably tries to get everything to conform to those semantics so that middleware has some convention and you get browser interop. It works surprisingly well because a good majority of apps are CRUD and don’t need anything more.

JSON-RPC tries to make routing/dispatch more automated, if I recall correctly, because REST is such a cluster (look at all the routing libraries, which to me are an anti-pattern). Plus, JSON-RPC is particularly easy in JS, which is usually the language on one side of the transaction.
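Roughly what that dispatch looks like, as a sketch in Python. The handler names (`add`, `greet`), the `METHODS` table and `handle` are hypothetical; the envelope fields and the -32601 “Method not found” code follow JSON-RPC 2.0. This deliberately skips parse errors, invalid params, notifications and batches, so treat it as an illustration of the idea rather than a conforming server.

```python
import json


# Hypothetical handlers; any plain function can be exposed.
def add(a, b):
    return a + b


def greet(name):
    return f"hello {name}"


# The method name in the request is the routing key; no URL patterns involved.
METHODS = {"add": add, "greet": greet}


def handle(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request string and return the response string."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = METHODS.get(request.get("method"))
    if method is None:
        response = {
            "jsonrpc": "2.0",
            "error": {"code": -32601, "message": "Method not found"},
            "id": request_id,
        }
    else:
        params = request.get("params", [])
        # JSON-RPC allows params by position (array) or by name (object).
        result = method(**params) if isinstance(params, dict) else method(*params)
        response = {"jsonrpc": "2.0", "result": result, "id": request_id}
    return json.dumps(response)
```

For example, `handle('{"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1}')` comes back with `"result": 5` and the same `id`. Adding an API is adding an entry to the table, which is the automation I’m talking about.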

COM solves all of the above and things like distributed ownership of resources, except it historically didn’t have a good security model (I don’t know it well enough to know what’s improved there but I don’t assume it’s stayed static since the 90s even if it hasn’t found traction outside Microsoft).

They all try to solve different parts of the problem or have different sets of tradeoffs, but to me it’s fundamentally the same problem. It’s the same reason I don’t view distributed systems as something that requires different machines, CPUs, processes or anything about geography. You can have distributed systems within one process, you can have it within processes on the same CPU, you can have it between CPUs on the same device, etc. Heck, to me even the kernel and user space form a distributed system with syscalls as a primitive RPC mechanism. JNI is a form of RPC too although you’ve seen traditional systems like gRPC supplant it even within process because it moves complexity out of the JNI layer into a simpler system with easier to follow rules.

Now you can sometimes make simplifying assumptions to reduce complexity, reduce power usage etc., and not all distributed systems necessarily have the same sets of problems. But fundamentally to me it’s all the same problem space, which is why I don’t distinguish the terminology so much.