It feels to me like the author used WebSockets for the wrong thing. I don't think WebSockets belong on a website (even an interactive one like Discourse is still a website by my criteria). They'd be more apt in web applications.
So, to me, the 1-connection-per-tab complaint feels like saying "someone launched 50 instances of an executable and it opened 50 TCP connections".
Examples of where I think it's acceptable/recommended to use Web Sockets include /(video|audio)?/ chats, multiplayer games, collaboration tools, etc. Stuff that requires real-time notifications.
Does a discussion platform require that time precision? I doubt it (frankly, I think WebSockets would give too high a "temporal resolution" unless there were proper delays in place, which just complicates the implementation, but that is actually irrelevant here).
The author is indeed well aware of WebSockets' limitations, and OP's "Examples of where it's acceptable/recommended to use WebSockets" seem sound.
The problem is that the appeal of oooh-shiny is so strong that some (many?) developers and influencers do end up "using WebSockets for the wrong thing". The last case I ran into: a tutorial for a client+server JS app using React & Redux [1], building... a CRUD voting application. Quoting the article on why it chose WebSockets for this: "This is an app that benefits from real-time communication, since it's fun for voters to see their own actions and those of others immediately when they occur. For that reason, we're going to use WebSockets to communicate."
Well yes, WebSockets work here (modulo problems and missed benefits/tooling that are the whole point of this article), and socket.io seems like a nice library, but a good old long poll would have done just as well :)
In general long polling takes more browser/network resources than web sockets. “Long polling” means you keep one request always open to the server, and whenever it is answered or times out, you immediately open a new connection.
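To make the mechanics concrete, here is a minimal sketch of that loop using fetch (the /events endpoint and the JSON payload are placeholders, not from the article):

    // A minimal long-polling loop: the request is held open by the server and
    // immediately re-opened by the client once it completes or fails.
    async function longPoll(url: string, onMessage: (data: unknown) => void) {
      while (true) {
        try {
          const res = await fetch(url);                 // server holds this open
          if (res.ok) onMessage(await res.json());
        } catch {
          await new Promise(r => setTimeout(r, 1000));  // brief pause on network error
        }
      }
    }

    longPoll('/events', data => console.log('update', data));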
cf. https://en.wikipedia.org/w/index.php?title=Comet_(programmin... for an explanation of various ways of implementing “comet” / “push” web applications. In my opinion anyone who would consider using long polling should default to using WebSockets, with some XHR streaming implementation as a backup option, and long polling as a last resort fallback when other methods don’t work. Long polling is strictly inferior to WebSockets for modern browsers. [I’m linking to the old version of the Wikipedia page because shortly after that version a rogue editor with an axe to grind came and obliterated most of the content, and the article never really recovered.]
Maybe you just meant “polling”, with some very low poll frequency, or perhaps some kind of exponential backoff? In my experience, websites which use polling usually also take more browser resources than WebSockets would take. Almost nobody bothers to back their polling rate off to once every few minutes or slower when they are in a background window/tab, because web developers have little to no incentive to care about client-side resource use, so as a rule they don’t. If your site is polling every few seconds, that ends up being terrible for browser performance once there are many tabs open.
More properly, web applications which expect they might be opened in multiple tabs (looking at you Facebook, Gmail, etc.) should just be opening a single WebSocket connection, and then communicating among themselves across tabs to pass data where it needs to go. It’s really annoying that lazy web developers just open up several new persistent connections per tab, wasting sockets / bandwidth / battery life / etc.
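One way to do that in current browsers is to elect a single "leader" tab that owns the WebSocket and relays messages to the others, e.g. over a BroadcastChannel (the channel name, URL and the naive election below are all just for illustration):

    // One leader tab owns the WebSocket; other tabs receive pushes over a
    // BroadcastChannel instead of opening their own connections.
    const bus = new BroadcastChannel('app-events');          // arbitrary channel name
    const handleUpdate = (data: unknown) => console.log('push update', data);

    // Naive leader election: the first tab to claim the flag opens the socket.
    // Real code would also handle the leader tab crashing or being closed.
    if (!localStorage.getItem('ws-leader')) {
      localStorage.setItem('ws-leader', '1');
      addEventListener('unload', () => localStorage.removeItem('ws-leader'));

      const ws = new WebSocket('wss://example.com/push');    // placeholder URL
      ws.onmessage = (e) => {
        handleUpdate(e.data);       // BroadcastChannel doesn't echo to the sender,
        bus.postMessage(e.data);    // so handle locally and fan out to the other tabs
      };
    } else {
      bus.onmessage = (e) => handleUpdate(e.data);
    }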
No, I meant long polling exactly as you defined it; periodic polling doesn't seem comparable to WebSockets because of the polling frequency.
Let me reformulate: as you say, yes, WebSockets are probably better in every way than long polling in modern browsers. But due to the problems with WS highlighted in the article, for applications that do not require real-time communication, I'd rather stick with a slightly slower long poll.
Regarding resource usage being greater with WS than with long polling, can you be more specific?
Long polling creates a new connection for each poll, and the connection is kept open anyway for a given length of time... it will still sit idle for a while before it drops off, depending on your OS configuration. WebSockets, on the other hand, keep a single connection open, and with the right backend there's VERY little overhead to actually doing this.
Current async platforms (Node, Go, etc.) have been shown to be able to keep a million+ open connections on a moderate server... Node in particular was geared towards a target of 100k connections per CPU, IIRC, which it now greatly exceeds. There are of course other application complexities that can reduce the connections per server, but that's another issue.
One does get the impression that the problems with websockets are mostly extremely large web companies complaining that it isn't perfectly ideal for their exact application, and that they'd rather have their exact use case written into browsers.
This seems like a terrible fit for WS. It's not a realtime problem, and doesn't require a realtime solution. Near-realtime will do just fine. I.e. have some endpoint that gives you voting results, and if requested with the text/event-stream media type will give you a stream of realtime events. Then post votes and whatnot with the usual HTTP methods. This is presumably less efficient in terms of transport, but much more straightforward (and thus more cost-efficient) to implement, and has some serendipitous side effects such as your API now being usable by any ol' HTTP client.
Edit: to be clear, I'm agreeing with your assessment; and criticizing the article you linked to. :o)
If you are getting the results as an event stream, that doesn’t really save anything on client-side resources. Seems just as “real time”, just with a slightly different back-end implementation.
The difference is on the client-to-server communication, since it's a one-way stream. So you're right in saying it's just as realtime when getting results from the server, but there's added latency to each request being made from the client. My point is that this latency is probably negligible for most "realtime" applications. The benefit to all this is that you're not re-inventing protocols on top of a low-level transport, and both front- and backend becomes significantly simpler (and thus more cost effective) to implement.
I'm not understanding your distinction between 'websites' and 'web applications' (we can have a debate about the distinction). All the issues that the author raised apply to both categories when WebSockets are used. So in context, it's a meaningless distinction. And even if you are writing just a 'website' - why shouldn't you use any standard tool available? It's like saying CSS3 is overkill for 'websites'.
At work we have a true 'web application' and opted for long polling for real-time collab for a variety of reasons, one of which is that long poll plays nice with corporate firewalls and WebSockets are a mixed bag.
I suppose inherent in the design of many modern 'web applications' is the concept of a single-page interface. This encourages user behavior through a single view, rather than how many people interact with a website, where they might spawn off dozens of tabs for each piece of content they intend to consume. In the case of the latter, you'd wind up with dozens of tabs each conducting some form of communication, whereas with a web application, the design of the app might result in fewer connections since they're being routed through an interface designed to manage the flow of user interaction. Certainly not the case for every website or every web application, but I think that's conceptually why the distinction was being made.
It's like @uptown described it. It's not so much about what technology is used, but the overall design, work flow, and the user requirements of the Web app.
Take Facebook.com, for example: it feels more like a website, while Messenger.com does feel more like a web app.
1. Ditch Ruby. Node or Python with asyncio work fantastic with WebSockets. Use a greenlet-enabled database connection pool.
2. Use HTTP for traditional request/response tasks. Use WebSockets to push realtime notifications to clients.
3. Have your clients auto-retry their WebSocket connections in the background. Show an indicator of degraded service when this happens. When a new WebSocket connects, blast a complete state update to it to take care of any missed messages. Now you can easily load-balance your clients.
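A minimal client-side sketch of point 3 (the 'sync' message and the 'degraded' CSS class are just placeholders):

    // Auto-retry in the background, show a degraded-service indicator, and ask
    // for a complete state update on every (re)connect, as described above.
    function connect(url: string, onState: (s: unknown) => void) {
      const ws = new WebSocket(url);

      ws.onopen = () => {
        document.body.classList.remove('degraded');      // hide "reconnecting…" banner
        ws.send(JSON.stringify({ type: 'sync' }));       // server replies with full state
      };

      ws.onmessage = (e) => onState(JSON.parse(e.data));

      ws.onclose = () => {
        document.body.classList.add('degraded');         // show the indicator
        setTimeout(() => connect(url, onState), 2000);   // a real client would back off
      };
    }

    connect('wss://example.com/push', state => console.log('state', state));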
Yes, yes, and yes, you nailed it. The author of the article seems to have difficulty separating "problems specific to WebSockets" and "specific problems I have."
People who say "REST is dead, AJAX is dead, HTTP headers are dead, all hail the new sockets" are wide-eyed fanatics and this article is a useful corrective to them, but it's definitely not an even-handed analysis.
There are too many problems with the article to do a line-by-line rebuttal, but the one that struck me as silliest was "what if a user has multiple tabs open, oh noes!" It's trivial to implement a session id and then disconnect the inactive tab(s), and has been for a long time.
On the other hand, his recommendations in the comments are pretty solid, so he's not completely off...
Better yet, use C#/ASP.NET with SignalR. SignalR automagically negotiates the best underlying technology (whether that's long polling, SSE, Web Sockets, or whatever), and provides you with a nice little abstraction layer over the top. And it scales quite nicely.
You do realize that socket.io (node) did it first?
I'm all for SignalR and even like C#/ASP.Net (and MVC). That said, there are other factors to consider when developing/deploying an application in .Net ... the abstractions needed for testable C# are particularly annoying imho.
I know it's an area of some debate, but at least to me, greenlets are easier to write and understand than callbacks, by an order of magnitude.
Greenlets work especially well with Python's "with" statement. You can have a database pool like this:
with (yield from pool.connection()) as conn:
    users = yield from conn.query('select * from users')
    for user in users:
        ...
Now your database connection is only locked up inside the "with" statement, and the whole thing is asynchronous even though it's written in a synchronous style.
We created a WebSocket client wrapper library to make your third point easier to deal with. It handles only the connection freshness judo and nothing else, and doesn't require a special server or anything. Maybe people here find it interesting: https://github.com/fanout/websockhop
I'm also a heavy websocket user, and agree with most points. I have previously used websockets on top of a traditional web app, and have been disappointed with the results.
My opinion now is that websockets are an all-or-nothing proposition. And I have gone all-in. My latest project has:
- websockets-only api (except for some image uploads and oauth/login)
- https-only
- single page app, central store with observable data, using a single connection servicing the whole application
- using SignalR (supports fallbacks for IE9 and old Android)
- Actor-based backend (Orleans)
This stack feels like the future to me, but I've had to learn a lot of new things. It's quite different from the web development that I've done over the past 15 years. Overcoming the connectivity issues mentioned in the article is only the first step. Websockets need another type of backend to make sense: Actors. I've had many challenges with the async realtime behaviour of the system, deadlocks, etc.
But now that everything is working and I have some grip on it, I can't imagine going back to a rest/database/cache web architecture.
I don't think that's a good idea since it very easily leads to overengineering and becoming accustomed to overengineered solutions.
The way to use new tech is when it gives a clear advantage, not just for the sake of using it. There's nothing wrong with a lot of older, simpler, more compatible solutions.
It's good the parent mentioned blogs, since I think they're one example where much ridiculous complexity has taken hold for little gain; please don't turn what should be nothing more than a widely-accessible, simple static page with minimal bandwidth and processing requirements into an elephantine multiple-layered monstrosity requiring tons of serverside infrastructure and only accessible with the latest browsers.
Always keep in mind YAGNI, KISS.
(This comes from the experience of someone whose work has involved replacing countless overbuilt systems with far simpler ones.)
Not sure what XorNot meant, but the way I've heard it, and it makes sense to me, is that it should be done with toy projects which are not intended to stick around.
Certainly it cost extra time, but it was mostly for learning. In the end it's not much more complex, just different. Some things that used to be easy are harder, but things that were very hard are now very easy. Cache-invalidation no more...
> What does it do? I hope the time investment is worth it and you're not trying to make a blog or some CRUD app.
It's a social network with push notifications and time registration. Users can share realtime location and status. I confess the scalability of the tech is not needed currently, but a fast ui even on very little hardware is still a benefit.
The backend is distributed Actors. Instead of bringing the data to the (web) request, the Actor model brings the request to the data (which may be on another server). Actor objects are loaded in memory and kept there while needed, so it's very fast by itself and there's no need for caching at all.
Storage can be very simple, it is possible to use SQL but I'm using Table Storage which is cheaper and more scalable. For searching we use ElasticSearch.
This server model fits very naturally with websockets.
Of course you can mix the two styles, but it makes little sense, as websockets do all the things REST can do, only faster and with state on the server. Client-side there is little difference between $.ajax(api, data) and conn.send(api, data). And it's just nicer to keep the API together server-side in (in my case) one SignalR controller instead of putting some parts in a Nancy module.
I do have some REST endpoints, e.g. for image upload and oauth; I did not know how to make those work otherwise on IE9.
Correct, but for web apps this is rather different. I found that my thinking was very much request-response. Probably game devs would pick this up much quicker than I did.
I've used websockets in several projects now. They are more problematic than I originally thought they would be.
On my current project I'm using server-sent events (SSE) for one-way pushing to the client. I'm using AJAX calls for the other direction. So far the experience has been awesome. Very lightweight and performant. Just a dozen lines of code on the client with no framework.
Caveat: Doesn't work with IE. A polyfill is required. I am very lucky on this project that I don't need to support IE.
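For reference, the whole client side of that kind of setup can look like this (endpoint paths are made up; fetch stands in for whatever AJAX helper you prefer):

    // Server-to-client: a single EventSource subscription (SSE).
    const events = new EventSource('/api/stream');           // placeholder path
    events.onmessage = (e) => console.log('update', JSON.parse(e.data));
    events.onerror = () => console.warn('stream interrupted, browser will retry');

    // Client-to-server: plain requests in the other direction.
    async function sendAction(action: object) {
      await fetch('/api/actions', {                          // placeholder path
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(action),
      });
    }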
Probably. SSE either just works, or blood starts coming out your nose because you've had a stroke debugging why hello world doesn't work, in which case: you have gzip turned on.
A polyfill is not required: code written for MSIE et al. works fine on other browsers, but some padding is required from the server for the same effect: 2k of padding for MSIE < 10 and Chrome < 13, 4k of padding for Android.
Disconnects are difficult to detect as well, so many people recommend sending a message every 15-30 seconds so that you get TCP to notify you, but I recommend just terminating every 15 seconds, and if the client is still there have them reconnect: It wastes a packet, but it means much less load and complexity on the server.
I am using the npm module rexxars/sse-channel. Maintaining connection has worked very well. During debugging I reload the server and/or client repeatedly and things just stay connected. I'm not sure if this is due to the way chrome handles the connection or the quality of the npm module.
If you disconnect the client and the client is behind some firewalls (or in some other circumstances), the server might not get notified until it writes something, but the client will reconnect automatically.
This means that the server now has two connections, and has to handle the error on write.
If the server periodically writes something, then it will get the error and can recover. If it does not, however, it will eventually run out of connections.
However, another approach is to simply set the socket timeout (SO_RCVTIMEO) every time you write. This disconnects the socket automatically, and the client (if still around) will reconnect. The savings can be significant on a busy server.
rexxars/sse-channel has very useful support for history. When a client disconnects and then reconnects, events that were missed are automatically resent. It uses the event ID provided by SSE to know what to resend. You can count on all events getting through without a lot of overhead.
One thing I think is wrong though is that no ID is sent on the first connection so history isn't sent then. My app needs this so I send the history myself on new connections.
I'm glad to hear other browsers work. I am curious about the padding thing. I don't understand what that has to do with SSE. I thought it was just a new protocol added to the browser.
SSE is plain HTTP, and it might be illustrative to look[1] at the HTTP response directly so that you can convince yourself of this.
EventSource is simply an implementation of a parser of those messages.
The problem is that MSIE and Chrome often delay dispatching content-received (XMLHttpRequest readyState=3) handlers until a certain number of bytes have been received (or in certain other circumstances, such as in Chrome's native implementation of EventSource). The bytes-received is solved by simply starting with some padding, and unless you need framing you might find it simpler to just use XMLHttpRequest directly.
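A rough sketch of the server side of that workaround, in Node (the 2k padding figure is taken from the comment above; everything else is illustrative):

    // Plain HTTP response that speaks SSE. The leading comment line (":" prefix)
    // is padding so that older MSIE/Chrome clients dispatch readyState=3 promptly.
    import { createServer } from 'http';

    createServer((req, res) => {
      res.writeHead(200, {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
      });
      res.write(':' + ' '.repeat(2048) + '\n\n');   // 2k padding, as noted above

      // Each message is just "data: ...\n\n" written to the still-open response.
      const timer = setInterval(() => {
        res.write(`data: ${JSON.stringify({ now: Date.now() })}\n\n`);
      }, 1000);

      req.on('close', () => clearInterval(timer));
    }).listen(8080);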
See the article (grin). It was mostly due to connection problems. I had to add yet another layer on top of the frameworks I used. Also I effectively had three channels. The two in the websockets and the ajax which I needed for historical reasons. Just an overall feeling of heavy weight compared to the very light SSE.
> Do you think it would be better for a project starting from scratch?
At this moment, yes. Now that I know you can use SSE with any browser, I'm not sure what advantage there is with websockets. Maybe someone can explain.
Out of curiosity, do you know if SSE would pipe through AWS Cloudfront? I can't find any information on this. Performance isn't an issue, it's just that I don't want to expose the backend directly so all requests go through AWS Cloudfront, but I'm stuck hard coding the url to the backend for SSE.
But yeah, I agree SSE is much nicer, especially being built into Rails >= 3 and doing regular POST requests the other way around.
The CloudFront docs don't mention SSE support and they definitely don't support WebSockets today.
That said, with CloudFront Custom Origins you can do "Transfer-Encoding: chunked" [1], which will cause the CDN server to deliver content to the client as it is sent by your application. That seems like it would be compatible with SSE. Unfortunately it also has a 30-second idle timeout [2], which may be a showstopper depending on how frequently your app sends events.
The complaint that the server side of web sockets requires using epoll is strange. Of course we have known for years that select has scalability problems, but we have moved on. That this implementer says they normally use select in 2015 for networked services is worrying.
If using epoll and kqueue directly is not your cup of tea, just use libuv or one of the other wrappers. It's not hard then. I know because I'm a native epoll user but recently helped someone on Stack Overflow and it took me maybe fifteen minutes to write a working libuv server having never used libuv before.
Most of these gotchas (and others not mentioned) can be avoided if you treat WebSocket for what it is: a low-level transport that can be swapped out. This is how we treat it in Socket.IO and Engine.IO.
Most people that implement WebSocket directly end up re-inventing a protocol on top of it. This is necessary because the "onmessage" handler of WS is not something you can really build an application on. And then you'll have to add reconnection and exponential delays after that.
If you just want to use HTTP polling, which as the author suggests is perfect for most applications (perhaps not games), you can simply do: `io('http://myhost', { transports: ['xhr-polling'] })`.
I disagree. I am currently all-in on WebSockets in web apps. HTTP is only used for static resources and script loading.
In all web apps, there are buttons. In some cases, clicking a button requires an action on the server, and therefore a roundtrip. If a WebSocket connection is established (you don't need more than one for everything), there is no per-request overhead. With HTTP, you need to build a new request every time.
Websocket plays nice with modern reactive streaming architectures. You send stream of events, you receive stream of updates. Everything is transparent and immutable and asynchronous.
If there is a connection loss, you are immediately notified and can move your app to offline mode. This is problematic with HTTP.
WebSockets enable web apps comparable with native apps. HTTP apps will always feel less responsive to user actions.
I'm just curious, why did you use Sockoweb instead of spray.io? We have similar needs for a webserver that supports websockets and akka - spray.io seems like it would fit the bill.
A little too overengineered for my taste. I tried akka-http which is essentially Spray 2.0 and succeeded, but with twice the amount of code that I had with Sockoweb.
FYI, there is a bug in Chrome that causes WebSockets to be slow [1]. For small messages, the transfer rate seems to be sufficient in most cases. But for larger messages (say, uploads), don't place your bets on WebSockets just yet.
I made an upload tool that needed to handle very large files, and I used WebSockets to do it.
The strategy I used was to parse the MOV/MP4 locally, take between 2 and 60 frames of data, hash them, then stream up the hashes to the server. Any hash the server didn't have got a reply to push those bytes.
The two-way nature of this dialog made WebSockets a good fit, although it certainly could have been emulated using other methods, it would have been much more code.
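Roughly, the exchange looks like this on the client (message names and framing are invented for illustration; the real tool may differ):

    // Hash-first upload over a WebSocket: send all chunk hashes, then push only
    // the chunks the server says it is missing.
    const ws = new WebSocket('wss://example.com/upload');   // placeholder URL

    function uploadChunks(chunks: { hash: string; bytes: ArrayBuffer }[]) {
      const byHash = new Map<string, ArrayBuffer>();
      for (const c of chunks) byHash.set(c.hash, c.bytes);

      ws.onopen = () =>
        ws.send(JSON.stringify({ type: 'hashes', hashes: Array.from(byHash.keys()) }));

      ws.onmessage = (e) => {
        const msg = JSON.parse(e.data);
        if (msg.type === 'need') {
          // A real protocol would tag each binary frame with its hash; omitted here.
          for (const hash of msg.hashes) ws.send(byHash.get(hash)!);
        }
      };
    }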
It seems like it wouldn't be much more code to just use an AJAX POST for the hash and have the server reply whether it has the hash or not, followed by the chunk upload again over AJAX. It would probably be slower and less data efficient, but I don't see how it would be much more code.
One reason is that you have more control over bandwidth contention. For example, if you have control over the outgoing packets, you can temporarily stop those packets when more important data needs to be sent.
I agree with most of the points in this essay. I wrote http://firehose.io/ a few years ago for Poll Everywhere and we've successfully scaled WebSockets in a production environment. We don't use it for 2-way communications though; it's HTTP requests in and WebSocket pushes out, with long-polling as a backup for non-WS clients. Check out the source code if you want to see an implementation of all the "hard lessons won" in this essay.
I also gave a talk about it at RailsConf (http://www.bradgessler.com/talks/streaming-rest/) and learned that RabbitMQ as a backend isn't a great idea. Feel free to ask questions if you want me to dive into more specific points.
I found that it was difficult to deal with a web client connection dying, then reconnecting, and receiving messages it may have missed while disconnected. There's a flag you can set for this in Rabbit when creating a queue, but I couldn't really get it to work.
The way Firehose works, the client gets a sequence ID per channel that's always counting up. If the connection drops when the client is on sequence 1, then 4 new messages come in and the client reconnects, it will get the rest of the messages up to sequence 5 and wait for new messages.
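The shape of that catch-up logic on the client, as I understand the description (this is not Firehose's actual code):

    // Remember the last sequence seen per channel and send it on reconnect; the
    // server replays everything after that sequence before streaming new messages.
    let lastSequence = 0;

    function subscribe(channel: string, handle: (payload: unknown) => void) {
      const ws = new WebSocket(
        `wss://example.com/${channel}?since=${lastSequence}`   // placeholder URL scheme
      );
      ws.onmessage = (e) => {
        const { sequence, payload } = JSON.parse(e.data);
        lastSequence = sequence;
        handle(payload);
      };
      ws.onclose = () => setTimeout(() => subscribe(channel, handle), 1000);
    }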
A goal of Firehose is to have the fewest number of transports possible so the client doesn't have to spend much time negotiating the transport. I found WebSockets was more widely supported (http://caniuse.com/#feat=websockets) and HTTP long polling could take care of the rest.
Before I built Firehose I looked at socket.io, which had a ton of transports, but found that it was too slow, flakey, and unpredictable for our needs.
I've had good results with SSE in just the situation that GP describes: client actions come in as POSTs and the results of those are communicated to everyone via SSE. In most cases we don't even care about the 204 or whatever. If it's not on SSE then it didn't happen. b^)
WebSockets are a tool like any other and as usual, picking the wrong tool for the job is a great way to shoot yourself in the foot. Agree with the author that they aren't suited to replace HTTP but I don't see the need for this apocalyptic tone. WebSockets have been available in many tech stacks for a while now and we don't see much abuse. Developers disciplined enough to stick to Rails don't seem overly prone to jump on the new shiny toy without thinking.
Web Sockets are great to bring some of the more heavy duty apps from Desktop/TCP to the web. PubNub, Pusher and the likes show it is possible to achieve reasonable scale.
Aren't full duplex connections normal outside of the web?
I mean, this isn't really a new kind of tech. It reads a bit like someone who rode horses his whole life, found out that there are cars, and now fears all the dangers they could bring.
Similarly to WebGL it will become just another technology that gets abused. Safari has an option to ask before enabling WebGL for a website. I had to turn it off after a while because basically every website was asking for WebGL. I don't mean websites with fancy 3D graphics or even 2D graphics like Google Maps. It's normal news websites that just show text and maybe some pictures.
Jeez web sockets really aren't that hard. We've used them for a long time on our production site for a live attack map: https://www.wordfence.com/ and we regularly have over 1000 concurrent visitors on the site when we promote a blog post or do something similar.
I can't think of any reason why I wouldn't use node.js on the server-side for web sockets. It obviously uses epoll() and you really need an event language to be comfortable writing an application that manages a large number of concurrent connections. Ruby for this makes me shudder - sorry it just feels like a square peg forced into a round hole that's late to the party.
I'm hardly a Ruby fan - nothing against it, I just prefer Python - but what makes it such a bad choice vis-a-vis JavaScript? Sure, Rails isn't built with this in mind, but why would Ruby with some kind of libuv-based framework be such a terrible choice?
It's not meant for it. Python is similar, there are hacks (such as eventlet) that have their own downsides. Use the right tool for the job. We are currently rewriting an eventlet-based part of our stack in Go with much better results.
Yeah, I can see how Go has some built-in advantages over Ruby and Python in this regard, but I was really looking to understand why mmaunder considers JavaScript a much better choice than Ruby.
Sorry, just read this now. JS's core DNA is as an event language because it was designed for the browser UI which is all about events. Through a wonderful turn of events someone slapped V8 on the server and people started dabbling with JS on the server at the same time as epoll() and/or kqueue emerged and started being used to amazing effect by nginx and other projects.
And so we ended up with this sweet spot of a fast event platform with a super accessible language and a way to handle hundreds of thousands of connections with almost no CPU load. So it's not really about the deficiencies of Ruby (which is awesome for what it's good at) but about how incredibly strong node.js and epoll() are for doing anything that would normally be designed with multiple threads and a blocking architecture.
So with regards to my square peg/round hole comment: I didn't mean sarcasm - what I mean is that node.js is just so incredibly great at any task where you're waiting on stuff that would block in old-school multi-threaded architecture - both in terms of performance (no load when nothing's happening), architecture (a single thread and event loop does it all) and programmability (event handlers, callbacks, closures and the syntax are so great for this kind of coding).
The thing is, Ruby can use epoll/libuv as well, so how is that an advantage of Node.js? So the only difference is in the languages themselves, and I don't see where Ruby lacks - after all, the callback pattern is pretty embedded in the language with blocks, and arguably more pleasant to use than JS functions.
This is a good article, albeit a bit long-winded. In my experience, whether or not WS can be beneficial essentially comes down to whether or not you truly need low latency duplex communication. A chat application, which seems to be the canonical WS example, probably doesn't; a twitch-style multiplayer game, probably does.
Server-sent events are a great tool for streaming server-to-client data where other, more specific media types (e.g. audio/video) don't fit the bill.
Not mentioned is a common pitfall with serving web sockets: it is very easy to end up in a situation where you may not recover from disruptions.
The problem is that establishing a web socket, especially when using SPDY or HTTPS (which is typical), is much more expensive than maintaining a connected one. So a service will typically see a steady creep up in the number of connections, with a relatively low number of new connections per second, and it's probably been tuned for massive levels of concurrency, and that's all manageable. But then the service has a blip of some kind and all of a sudden all of those web socket clients try to connect at the same time; disaster.
At that point the service is incredibly overloaded, so many connections fail, which results in retries, which makes matters even worse still and fuels a vicious cycle. That situation might self-stabilize if browsers implemented exponential back off, but that's not typical either. So the service remains hosed until some kind of blood letting can be done to let clients back in at a manageable rate.
Extremely pernicious, and something I've seen several times. When scaling web-sockets, it's important to ask: can I handle all of these clients reconnecting at the same time?
Is there a formal term for this websocket reconnect stampede? What server-side stack and language were you using? What client-side language and libraries? How many users were active, and how many CPU cores were dedicated to the websockets? Were you load balancing with or without sticky sessions? Was your websocket a public broadcast or did you do authentication checks for rooms/channels? Have you considered doing a formal writeup of this pitfall and how it was resolved for your project?
Browsers do not have any native reconnect behavior for WebSockets. You have to implement reconnect in your application, which usually means exponential backoff. There are a few open-source implementations of this very feature.
Since you control the reconnect behavior, you could have clients wait a random amount of time from 1-5s before attempting the first connection. That way your reconnect storm is spread uniformly over a broader time frame.
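Something along these lines (the numbers are arbitrary; the point is that the jitter spreads the reconnect storm over time):

    // Exponential backoff with random jitter so that a server blip doesn't make
    // every client reconnect in the same instant.
    function reconnectDelay(attempt: number): number {
      const base = Math.min(30000, 1000 * 2 ** attempt);   // cap the backoff at 30s
      const jitter = 1000 + Math.random() * 4000;          // 1-5s of random spread
      return base + jitter;
    }

    function connectWithBackoff(url: string, attempt = 0) {
      const ws = new WebSocket(url);
      ws.onopen = () => { attempt = 0; };
      ws.onclose = () =>
        setTimeout(() => connectWithBackoff(url, attempt + 1), reconnectDelay(attempt));
    }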
If you deploy websockets in your application, either make sure your application can use some fallback like long polling or make sure you don't have any enterprise customers. They tend to have pesky proxies that mess with traffic. Sometimes websockets simply don't work, but sometimes they initialize correctly and then no data passes through(!). It's a mess.
The summary of comments here seems to be that those with more extensive experience with websockets agree with the article and see issues with them.
Those still in favor / who find themselves disagreeing, while not necessarily saying so explicitly, seem to be those just starting out on their websocket journey.
I'd disagree with that. I have a lot of experience with web sockets and I'd say the blog post is short-sighted in a bunch of ways. Having done a large amount of TCP socket programming in other environments outside of web apps, I don't really have the same issues with debugging that others seem to complain about.
Arguing that "WebSockets are bad, because the principals we use for HTTP scalability are not applicable to them" is not wrong, but feels short sighted. Yes: Scaling numerous persistent connections is hard, seeing that most of the available infrastructure is geared towards short-lived HTTP request/response requirements. But there are realtime specific alternatives. You wouldn't want to build your realtime backend from scratch, but rather use something like http://deepstream.io which takes care of the heavy lifting.
If you're interested in implementing reliable long-polling as described in the article, you might check out Pushpin (http://pushpin.org). It's a proxy server that you put in front of your API backend and it makes this sort of thing really easy.
Pushpin also supports WebSockets, but the long-polling capability is first class and not an afterthought.
As a quite heavy user of websockets, and having done a few implementations of them, I agree only partly with it:
Yes, Websockets are hard to implement, and most implementations that are floating around (e.g. on GitHub) are horribly broken. But the same probably applies to good HTTP and especially HTTP/2 implementations too!
I also totally agree that web sockets make load balancing and proxying more complicated than stateless HTTP requests.
But I still think that Websockets have their place. They are some kind of more basic building block (like TCP), on which you can build your own protocol with totally different semantics. If you only build some request/response protocol on top of it you probably have not gained much - but still it would be different from HTTP in the way that message ordering is guaranteed. If you only need events from the server to the client you could use SSE, but if you need to add some dynamic subscribe/unsubscribe functionality you would need some additional HTTP [POST] calls. You could get the idea that you use one SSE endpoint per topic and treat the stream disconnect as an unsubscribe, but as long as you can't rule out that you have any non HTTP/2 clients it won't work because of the connection limit in browsers. I also benchmarked that approach with node and golang http/2 against my websocket based protocol and achieved only a throughput of about 10% of the websocket based solution.
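For the dynamic subscribe/unsubscribe case, the custom protocol on top of the WebSocket can be as small as a couple of message types (this framing is made up, purely to illustrate the "basic building block" point):

    // A tiny topic-based protocol layered over one WebSocket connection.
    type Up   = { op: 'subscribe' | 'unsubscribe'; topic: string };
    type Down = { topic: string; event: unknown };

    const ws = new WebSocket('wss://example.com/bus');       // placeholder URL
    const handlers = new Map<string, (event: unknown) => void>();

    function subscribe(topic: string, handler: (event: unknown) => void) {
      handlers.set(topic, handler);
      const msg: Up = { op: 'subscribe', topic };
      ws.send(JSON.stringify(msg));
    }

    function unsubscribe(topic: string) {
      handlers.delete(topic);
      const msg: Up = { op: 'unsubscribe', topic };
      ws.send(JSON.stringify(msg));
    }

    // Messages arrive in order; dispatch them to whoever subscribed to the topic.
    ws.onmessage = (e) => {
      const msg: Down = JSON.parse(e.data);
      handlers.get(msg.topic)?.(msg.event);
    };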
All in all, I currently think one should favor pure HTTP if scaling is a big issue and most state is persisted in some distributed database anyway.
If only a single server is the remote peer, the state is not necessarily persisted per client, a combination of RPC and event distribution semantics is needed, message ordering guarantees are necessary, or strong realtime characteristics are needed, then websockets are a very powerful option that seriously needs to be considered.
My personal use-case is forwarding data from automotive realtime bus systems into web-based user interfaces and I had great success with using websockets to achieve this. Although I'm still also looking into HTTP/2, because it really has some interesting properties.
One other thing I would like to add: One comment in the article mentions to always prefer websocket frameworks like socket.io or signalR. What you should take into account is that by using one of those frameworks you are also binding yourself to a specific technology stack (node, C#, ...), if you are not willing to reimplement the protocols. If you use your own well-specified protocol or another well-specified higher level protocol (like WAMP), then you are not bound to specific technologies for your client and server applications. Therefore I always opt to go that way if long maintenance, open APIs or good interoperability are desired.
Can't stand the cargo cult thinking/writing here. The author seems to think websockets are dangerous like handling a piece of glass.
It's a technology like anything else. See if it meets your requirements, if it does, great, go for it and learn along the way. If not, stick to regular connections.
I agree with the approach indicated by the title ("caution required"), but the tone of the article veers toward being alarmist.
Takedown articles tend to share similar characteristics. Many are on a mission that emphasizes the negatives of their subject, even minor ones, while overlooking the positives and handwaving away problems with the alternatives. They set the bar higher for the subject by finding issues that it shares equally with its alternatives, and presenting them as damning evidence against it (since users may expect it to be better and be disappointed). The specific alternative that's held up as superior to the subject varies from point to point; no one alternative is actually superior in all points. They may even acknowledge that many of their arguments are weak, but make up for it in volume.
In its favor compared to such articles, this one does make some attempts to be evenhanded, but they're overpowered by the dramatic language used to make the opposing points. (For example, the brief acknowledgement that point 1 is the "weakest argument against WebSockets" is surrounded by phrases like "wreak havoc", "weird stuff will definitely happen", "rogue proxies will break you").
This article puts itself squarely in the category of "the case against".
A more balanced article might discuss:
1. The situations where you might want to avoid WebSockets (no HTTPS, IE 8 & 9 compatibility, a server platform without good support, or where they don't provide actual usability benefits).
2. Pitfalls that both long-polling and WebSockets share (you have to handle reconnections and synchronize state, be mindful of the open connections when reconfiguring a load balancer, account for server load, and use a server platform with good support for long connections).
3. Differences that can be viewed just as reasonably in WebSockets' favor, which the article counts as negatives. Example 1: Resynchronizing state after a network interruption is necessary with both long-polling and WebSockets, though the article notes it as a WebSockets disadvantage. The reconnection process is explicit with WebSockets, but some long-polling libraries make it very easy to ignore these events, leaving the client out of sync. Example 2: WebSockets aren't limited to 6 connections, unlike long-polling, so you won't get a silent connectivity roadblock on some tabs when the user has several of them open.
4. Advantages unique to WebSockets: responsiveness is improved due to not needing to send HTTP headers with every request and response, and the underlying TCP connection is already warmed up; load balancing and redundancy may be simplified by no longer requiring a session affinity mechanism (because the connection itself preserves short-term state, which can be re-established by the client transparently to the user in the rarer event of reconnection); less overhead than long-polling; a whole class of problems resulting from cookies as the sole mechanism of state transfer is avoided; the same protocol can be implemented simply and efficiently by native mobile apps.