HTTP is a stateless protocol. What would a stateful HTTP protocol look like?

wmf · on Feb 18, 2012

Authentication would only be done once. Likewise, User-Agent and Accept headers would only be sent during the handshake. Cookies would probably be handled differently. The current page URL could be treated like a cursor and the client could send relative URLs. Maybe WebSocket could be built in instead of being a separate protocol. These are all pretty minor things; in some sense HTTP wouldn't be HTTP if it had been designed stateful from the beginning.
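To make that concrete, here is a minimal sketch of such a connection. It is purely hypothetical: `StatefulConnection`, its fields, and the example URLs are invented for illustration and do not correspond to any real protocol or library. It only shows headers being sent once and the current URL acting as a cursor for relative requests.

```python
from urllib.parse import urljoin


class StatefulConnection:
    """Hypothetical stateful HTTP-like connection (not a real protocol)."""

    def __init__(self, base_url, user_agent, accept, credentials):
        # Sent once during the handshake, then remembered for the whole connection.
        self.handshake = {
            "User-Agent": user_agent,
            "Accept": accept,
            "Authorization": credentials,
        }
        # The "current page" URL, treated like a cursor.
        self.cursor = base_url

    def get(self, relative_url):
        # Resolve the relative URL against the cursor, then move the cursor.
        self.cursor = urljoin(self.cursor, relative_url)
        # Only the request line would need to travel; no headers are repeated.
        return f"GET {self.cursor}"


conn = StatefulConnection(
    "http://example.com/docs/index.html", "demo/1.0", "text/html", "token abc"
)
print(conn.get("intro.html"))     # GET http://example.com/docs/intro.html
print(conn.get("../about.html"))  # GET http://example.com/about.html
```

Note that the second request only makes sense given the first one, which is exactly the property a stateless protocol avoids.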

evantahler · on Feb 18, 2012

This. When thinking about stateful protocols, I like to use the metaphor of a server-side session variable which can hold state for you to access instantly. Because the connection never closes, you can always access "my_last_command" or an array of previously set variables.

I make use of these notions in the actionHero API framework (as it is a framework for both HTTP and raw TCP clients). I put down an example of a TCP session which may be of some use.

http://actionherojs.com/

dglassan · on Feb 18, 2012

Could you talk about that last sentence a little more? Is there a reason that HTTP was designed to be stateless rather than stateful?

emmelaich · on Feb 18, 2012

Generally speaking, the lower levels of the stack should always do the simplest, least amount of work. One should ask not 'why not' but 'why' when designing protocols (imnsho); in other words, which is the correct layer in which to put session state?

kls · on Feb 18, 2012

Is there a reason that HTTP was designed to be stateless rather than stateful?

One of the original design goals of the internet was the ability to survive loss of capacity. This was managed mainly through lower-level routing at the IP layer, but HTTP took on some of those goals as well. So it became kind of an all-or-nothing protocol where all of the communication is packaged in a single transmission. It is actually a bunch of small packets independently transmitted with a manifest, but at a higher level it is easier to conceptualize it as a single request/response communication. Much like a CB radio, one side talks and the other listens, then the other side talks and the first listens. This is done by chopping that communication up into packets and then routing them to and fro via the lower-level routing I mentioned earlier. The big benefit is that it is fairly resilient to the loss of nodes along the way: if a node is lost, the packets are retransmitted along the most efficient path available. It also makes horizontal scaling fairly easy if statelessness is adhered to in the higher-level protocols and in the server architecture. Two HTTP server nodes that have the same data available can service requests for each other transparently, and the client is none the wiser. Generally speaking, if an application adheres to the stateless architecture it tends to have less complexity, because it is not trying to fight against the underpinnings of the protocol; it capitalizes on the design of the system instead of constantly trying to compensate for making the architecture into something it was not.
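The horizontal scaling point can be sketched in a few lines. This is only an illustration under assumed names (`DATA`, `make_node`, `dispatch`, and the node labels are all made up): two handlers share the same data and keep no memory between requests, so a round-robin dispatcher can send any request to either one.

```python
import itertools

# Shared content that every node can see (hypothetical example data).
DATA = {"/index.html": "<h1>Home</h1>", "/about.html": "<h1>About</h1>"}


def make_node(name):
    def handle(path):
        # The response depends only on the request and the shared data,
        # never on anything remembered from an earlier request.
        body = DATA.get(path)
        status = 200 if body is not None else 404
        return {"status": status, "body": body, "served_by": name}

    return handle


nodes = [make_node("node-a"), make_node("node-b")]
round_robin = itertools.cycle(nodes)


def dispatch(path):
    # Any node will do, so just take the next one in the rotation.
    return next(round_robin)(path)


first = dispatch("/index.html")
second = dispatch("/index.html")
print(first["served_by"], second["served_by"])  # node-a node-b
print(first["body"] == second["body"])          # True
```

If either handler kept per-client state in memory, the dispatcher would have to pin each client to one node, and losing that node would lose the state.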

Another core concept of HTTP in which statelessness helped significantly was addressability. The goal was to have a platform where content and resources could be interlinked, so I could have one resource from one provider and another resource from another provider in a single page, or hyperlinked. Disparate resources could thus be woven together in a web. This is the WWW, but the WWW and HTTP are closely bolted together in their goals. HTTP was designed to bring the WWW into being; the WWW is the product of millions of disparate HTTP systems. Anyway, statelessness allowed these systems to be loosely coupled, without the need for validation and sessions with each of these various provider systems.

Server session and all of its pitfalls are a good example of what HTTP would look like if it had been designed stateful. We would have issues with resources timing out all the time, because servers would have to trim resource utilization after set time periods. Given the design goal of surviving loss of nodes, the protocol has to survive the loss of clients, and there is no guarantee that a client will be able to notify you before said loss. So you are constantly guessing, by inactivity and other less-than-accurate measures, whether or not to kill the state of the client.
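A rough sketch of that guessing game, with invented names (`SessionStore`, `touch`, `reap`) and an arbitrary 30-second timeout chosen only for the example: the server cannot tell a vanished client from a slow one, so it can only reap sessions by idle time.

```python
class SessionStore:
    """Toy server-side session store that expires sessions by inactivity."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.sessions = {}  # session id -> (last_seen, state)

    def touch(self, session_id, now):
        # Record activity and return the session's state, creating it if needed.
        _, state = self.sessions.get(session_id, (now, {}))
        self.sessions[session_id] = (now, state)
        return state

    def reap(self, now):
        # Drop every session that has been idle longer than the timeout.
        expired = [
            sid
            for sid, (last_seen, _) in self.sessions.items()
            if now - last_seen > self.idle_timeout
        ]
        for sid in expired:
            del self.sessions[sid]
        return expired


store = SessionStore(idle_timeout=30)
store.touch("alice", now=0)["cart"] = ["book"]
store.touch("bob", now=20)
expired = store.reap(now=40)
print(expired)  # ['alice']
```

Alice may simply have been reading the page for 40 seconds, yet her cart is gone; the timeout is a guess either way.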

This is why I like where we are heading. I started doing web dev about 6 months or so after the WWW was invented. I followed it through CGI Post all the way to the modern era, and of all the things we bolted on, the one that never sat right with me was the concept of server session. I always thought the client should hold state and transmit that state to the server with inbound requests. The client maintaining state makes more sense because it is managing the least amount of resources, and the state matters only to that particular client and the state that client is in. If the client is lost, the server does not need to figure out how to deal with the resources it allocated for that client, because the resources are cleaned up with the natural request/response cycle. I personally like where we are heading with HTML/JS applications that request data via REST services; it feels to me like we are getting back to where the web was heading before server session and all of the server pages mess. Further, REST extends the concept to make data and computational power addressable, which is turning the web into a distributed platform, run by many providers, on top of which we build HTML/JS, iOS, etc. front-end workflows.
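As a minimal sketch of the client-holds-state idea (the `handle_request` function, the `Client` class, and the JSON shape are assumptions made up for this example, not any particular REST API): the client sends its state with every request and the handler keeps nothing afterwards.

```python
import json


def handle_request(request_json):
    """Stateless handler: everything it needs arrives in the request."""
    request = json.loads(request_json)
    cart = list(request["state"].get("cart", []))
    if request["action"] == "add":
        cart.append(request["item"])
    # The updated state goes back to the client; the server keeps nothing.
    return json.dumps({"state": {"cart": cart}})


class Client:
    """Holds the state and transmits it with every inbound request."""

    def __init__(self):
        self.state = {}

    def send(self, action, item):
        request = json.dumps({"action": action, "item": item, "state": self.state})
        response = json.loads(handle_request(request))
        self.state = response["state"]


client = Client()
client.send("add", "book")
client.send("add", "pen")
print(client.state)  # {'cart': ['book', 'pen']}
```

If the client disappears mid-session, there is nothing on the server side to time out or clean up.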

davyjones · on Feb 18, 2012

A good example for comparison is FTP, which is stateful. For example, your login, file transfer modes, etc. are preserved between calls.
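A toy illustration of that statefulness, not a real FTP implementation: `ToyFtpSession` is a made-up class that accepts any password and handles only a handful of commands, but it shows how the login, transfer type, and working directory set by earlier commands change the meaning of later ones.

```python
class ToyFtpSession:
    """Simplified FTP-like control session; state persists between commands."""

    def __init__(self):
        self.user = None
        self.logged_in = False
        self.transfer_type = "A"  # ASCII is the FTP default type
        self.cwd = "/"

    def command(self, line):
        verb, _, arg = line.partition(" ")
        if verb == "USER":
            self.user = arg
            return "331 Password required"
        if verb == "PASS":
            # Toy behaviour: any password is accepted once USER was sent.
            self.logged_in = self.user is not None
            return "230 Logged in" if self.logged_in else "503 Send USER first"
        if not self.logged_in:
            return "530 Not logged in"
        if verb == "TYPE":
            self.transfer_type = arg
            return "200 Type set"
        if verb == "CWD":
            self.cwd = arg
            return "250 Directory changed"
        if verb == "PWD":
            return f'257 "{self.cwd}"'
        return "502 Command not implemented"


session = ToyFtpSession()
commands = ["PWD", "USER anonymous", "PASS guest", "TYPE I", "CWD /pub", "PWD"]
replies = [session.command(c) for c in commands]
print(replies[0])   # 530 Not logged in
print(replies[-1])  # 257 "/pub"
```

The same `PWD` command gets two different answers depending on what came before it, which is the contrast with a single self-contained HTTP request.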

robgibbons · on Feb 18, 2012