HTTP is a huge, hefty, inefficient and complex protocol whose only advantage is that HTML/JS supports it by default. Arguments that 'websockets' solve this are ridiculous in the face of the fact that we can just use 'sockets', like we always have. WebSockets are a work-around for the constraints of the browser.
As a mobile/desktop/server engineer, I would love the opportunity to work with other server-side teams that aren't wedded to the web/HTTP via historical accident and thus don't force us to use HTTP.
HTTP implements an architectural style which ensures reliability, scalability, decoupling of systems and support for hypermedia for a complex network of disparate, unreliable systems and networks.
Do you have any suggestion that provides the same features, or should we forgo them because HTTP is "hefty"?
But do you have any concrete suggestions of protocols, or are you criticizing the choice based on a hypothetical protocol that would be very similar but incompatible with HTTP and all its existing tools (millions of tested and deployed caching servers, load balancers, etc.), and for which whole new libraries would have to be written, just so you can make it somewhat more efficient?
I think you're grossly over-estimating the difficulty of defining a protocol. It's no more difficult than defining the protocol for which you'll use HTTP as transport.
Load balancers know how to load balance straight TCP. HTTP caching servers are an HTTP-centric idea.
The 'libraries' you'll need can be much, much smaller when all you need is a bit of framing and serialization, instead of a complete, complex, RFC-compliant HTTP client stack.
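To be concrete, here's roughly what I mean by "a bit of framing and serialization": length-prefixed JSON over a plain socket. (A sketch in Python; the message shape and helper names are made up for illustration, not from any particular protocol.)

```python
import json
import struct

def encode_frame(message: dict) -> bytes:
    """Serialize a message as JSON behind a 4-byte big-endian length prefix."""
    payload = json.dumps(message).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def decode_frames(buffer: bytes):
    """Pull every complete frame out of `buffer`; return (messages, leftover)."""
    messages = []
    while len(buffer) >= 4:
        (length,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + length:
            break  # partial frame: wait for more bytes from the socket
        messages.append(json.loads(buffer[4:4 + length]))
        buffer = buffer[4 + length:]
    return messages, buffer
```

That's the whole wire format: a receive loop appends socket reads to a buffer and calls `decode_frames` on it. No status lines, no header parsing, no chunked encoding.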
> I think you're grossly over-estimating the difficulty of defining a protocol.
It's not writing the protocol that I find the most difficult. It's reimplementing everything that uses the protocol.
> Load balancers know how to load balance straight TCP.
Which is only useful if all the nodes are exactly the same, but that prevents you from distributing the data across them based on user profiles and then load balancing by user id, as (if I'm not mistaken) Netflix does. Since they use subdomains as user identifiers, you'd get that for free with an existing, well-tested HTTP load balancer.
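Roughly, the routing an off-the-shelf HTTP load balancer gives you here looks like this (a Python sketch; the backend pool and hashing scheme are illustrative, not Netflix's actual setup):

```python
from hashlib import sha256

# Hypothetical backend pool; in practice this would come from configuration
# or service discovery.
BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

def backend_for(host_header: str) -> str:
    """Pick a backend from the user-identifying subdomain of the Host header."""
    subdomain = host_header.split(".", 1)[0]
    # A stable hash keeps a given user pinned to the same node.
    digest = int(sha256(subdomain.encode("utf-8")).hexdigest(), 16)
    return BACKENDS[digest % len(BACKENDS)]
```

The point is that the Host header is already there in every HTTP request, so the balancer can route statefully without the application doing anything.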
> HTTP caching servers are an HTTP-centric idea.
That's a tautology. The question is: are they a useful idea? Is being able to take advantage of existing and deployed solutions like CDNs useful? Seems to me like it would be.
> The 'libraries' you'll need can be much, much smaller when all you need is a bit of framing and serialization, instead of a complete, complex, RFC-compliant HTTP client stack.
I think you underestimate the advantages that some of the core HTTP concepts provide.
> Which is only useful if all the nodes are exactly the same, but that prevents you from distributing the data across them based on the user profiles, and then load balance according to the user id, as (if I'm not mistaken) Netflix does. Since they're using subdomains as user identifiers, you'd get that for free using an existing, well-tested HTTP load balancer.
I'm not sure what you think makes that complicated to implement without HTTP, or why you consider it 'free'. Netflix had to write custom code to support that, and could just as easily have done so on top of a message-passing architecture à la ZeroMQ or even AMQP.
> That's a tautology. The question is: are they a useful idea? Is being able to take advantage of existing and deployed solutions like CDNs useful? Seems to me like it would be.
Not really, no -- neither a tautology nor are they particularly useful for API implementation. Their primary value is in caching resources for HTTP requests in a way that meshes well with the complexity of HTTP.
If you need geographically distributed resource distribution, then HTTP may be a good idea simply because:
- There's widespread standardized support for HTTP resource distribution.
- Its inefficiencies are easily outweighed by the simple transit costs of a large file transfer.
We're largely talking about server "API", however.
> I think you underestimate the advantages that some of the core HTTP concepts provide.
No, the core concepts are more-or-less fine. It's the stack that's inefficient and grossly complex, largely due to browser constraints and historical limitations.
> I'm not sure what you think makes that complicated to implement without HTTP, or why you consider it 'free'. Netflix had to write custom code to support that, and could just as easily have done so on top of a message-passing architecture à la ZeroMQ or even AMQP.
It's free because it already exists. Load balancers for hypothetical protocols don't.
> Not really, no -- neither a tautology nor are they particularly useful for API implementation. Their primary value is in caching resources for HTTP requests in a way that meshes well with the complexity of HTTP.
> If you need geographically distributed resource distribution, then HTTP may be a good idea simply because:
> - There's widespread standardized support for HTTP resource distribution.
> - Its inefficiencies are easily outweighed by the simple transit costs of a large file transfer.
> We're largely talking about server "API", however.
Isn't the whole point of this system to transfer people's content - posts, pictures, videos, etc - between servers? I would think pure API "calls" would be a small part of the whole traffic.
> No, the core concepts are more-or-less fine. It's the stack that's inefficient and grossly complex, largely due to browser constraints and historical limitations.
But to implement them, you need more than "a bit of framing and serialization".
> But to implement them, you need more than "a bit of framing and serialization".
I posit you're still grossly overestimating complexity based on your own experience with HTTP, coupled with grossly underestimating the complexity, time costs, and efficiency costs of the stack HTTP weds you to.
A TCP stream is simple. It's as simple as it gets. Load balancing it requires a few hundred lines of code, at a maximum. It only gets complicated when you start layering on a protocol stack that is targeted at web browsers, has grown over the past 20 years, requires all sorts of hoop-jumping for efficiency (keep-alive, WebSockets, long-polling), and requires a slew of text parsing and escaping (percent-escapes, URL encoding, base64 HTTP basic auth, OAuth, ...), plus cookies, MIME parsing/encoding, and so on.
All this complexity is targeted at web browsers, introduces significant inefficiencies, and requires huge libraries to make it accessible to application/server engineers.
What's the gain? Nothing other than familiarity, as evidenced by your belief that the core of what HTTP provides is so incredibly complicated, and you couldn't possibly replace it.
No -- it's the complexity of HTTP that's complicated, not the concepts that underlie it. Drop the HTTP legacy and things get a heck of a lot simpler.
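To back up the "few hundred lines" claim: the core of a round-robin TCP balancer fits in a couple dozen lines. (A Python sketch under obvious simplifications -- no health checks, no graceful shutdown, thread-per-pipe -- not production code.)

```python
import socket
import threading
from itertools import cycle

def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes one way until the sender closes its end."""
    try:
        while chunk := src.recv(4096):
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def balance(listener: socket.socket, backends, max_conns: int) -> None:
    """Accept connections and splice each onto the next backend, round-robin."""
    pool = cycle(backends)
    for _ in range(max_conns):
        client, _ = listener.accept()
        upstream = socket.create_connection(next(pool))
        threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()
```

Everything a real balancer adds (health checks, connection limits, metrics) is incremental on top of this loop, and none of it needs to know what bytes it's shuffling.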
It's not so much that you can't replace HTTP, as that you can't replace all the thousands of tools and packages that already work with HTTP, and that can be very useful for a project like this. And you can't easily replace the knowledge that people have of HTTP either. (Claiming that OAuth is part of HTTP doesn't help with your credibility either, I'm afraid.)
Furthermore, I think that even if the developers of this project could replace the required tools and forgo the rest, I doubt it'd make sense.
Frankly, you'd need a working prototype to convince me of the contrary, so I guess we'll have to leave it at that. I'm a stubborn man ;)
While iMatix was the original designer of AMQP and has invested hugely in that protocol, we believe it is fundamentally flawed, and unfixable. It is too complex and the barriers to participation are massive. We do not believe that it's in the best interest of our customers and users to invest further in AMQP. Specifically, iMatix will be stepping out of the AMQP workgroup and will not be supporting AMQP/1.0 when that emerges, if it ever emerges.
By the way, the AMQP spec is roughly the same size as the HTTP spec, and the latter spends a lot of pages listing just status codes.
And of course, AMQP uses a model based on Sessions, which is great if the components of the system are static, but not that great if you're talking to a lot of nodes that come and go, since you'll end up with uneven load distribution on your servers.
Regardless of HTTP as a particular implementation, I think statelessness makes perfect sense in an unreliable network of nodes.
WebSockets also have the advantage that they pass through corporate firewalls, open wifi networks, and many proxies, since they masquerade as HTTP traffic. And they have a nicer framing mechanism than a raw socket, something I love. (Nowhere near as low-level, but for me it's essentially stateful UDP that's reliable, i.e. TCP except with datagrams.)
Corporate firewalls are the usual bogeyman, but in reality I haven't seen evidence that they're much more than that.
To test this, we implemented fallback-to-HTTPS behavior in a very widely used previously non-HTTP client. We then observed the number of clients that failed to connect via our custom protocol, but succeeded in falling back to HTTPS.
The numbers were negligible.
It's ridiculous that we'd seriously believe we can't trust that TCP works on the internet. We joke about it being the "interweb", but I see no reason to sow fear, uncertainty, and doubt, and thus actually turn the interweb into reality.
I believe his point is that you can generally carry whatever protocol you want over port 443 (and often port 80).
Given how many other things are broken by networks that foolishly open only ports 80 and 443, and their (in my experience) relative rarity, I'd suggest that it's not worth bothering with, except possibly as a fallback to measure the actual number of people trying to use your service behind such a network.
I think 443 is the better example because it's harder for a middle party to profile HTTPS traffic and see that it is indeed HTTPS and not something else.