WebSockets are great when used in addition to polling. This way, you can design a system that doesn't result in missed events. Example: have a /events?fromTS=123 endpoint.
At FastComments we do both. We use WS, and then poll the event log when required (on reconnect, for example).
Products that can get away with just polling should. In a lot of scenarios you can just offload a lot of the work to companies like OneSignal or UrbanAirship, too.
If you're going to use WS and host the server yourself, make sure you have plans for being able to shard or scale it horizontally to handle herds.
It was hard for us to not use websockets, since like 70% of our customers pick us for being a "live" solution for live events etc.
Depending on what you are trying to achieve, I also recommend SSE over websockets, especially if all you want is to signal clients when state changes on the server.
SSE is a simple protocol that you can easily implement yourself both in the server and the client if the client lacks support.
SSE also fits naturally on top of existing request-response infrastructure if you only use it for notification and keep everything else the same, i.e. on an SSE notification event the client fetches new data from the same endpoints as before. That also means it can be turned off just as easily if it becomes problematic, e.g. when server load gets too high.
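To give a feel for how small the wire format is, here is a rough sketch of encoding and parsing SSE events by hand. The names `SseEvent`, `encodeSseEvent` and `parseSseChunk` are made up for illustration, and the parser assumes it is handed complete events with "\n" line endings (a real client would also buffer partial chunks and handle "\r\n" and `retry:`).

```typescript
// One SSE event: "field: value" lines, terminated by a blank line.
interface SseEvent {
  event: string; // defaults to "message" when the field is absent
  data: string;
  id?: string;
}

// Server side: serialize an event into the text/event-stream format.
function encodeSseEvent(e: SseEvent): string {
  const lines: string[] = [];
  if (e.id !== undefined) lines.push(`id: ${e.id}`);
  if (e.event !== "message") lines.push(`event: ${e.event}`);
  // Multi-line data becomes one "data:" line per line of payload.
  for (const d of e.data.split("\n")) lines.push(`data: ${d}`);
  return lines.join("\n") + "\n\n";
}

// Client side: parse a chunk that contains only complete events.
function parseSseChunk(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of chunk.split("\n\n")) {
    if (block.trim() === "") continue;
    let event = "message";
    let id: string | undefined;
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment line, often used as keep-alive
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
      else if (field === "id") id = value;
    }
    // Blocks without any data line are not dispatched as events.
    if (data.length > 0) {
      const ev: SseEvent = { event, data: data.join("\n") };
      if (id !== undefined) ev.id = id;
      events.push(ev);
    }
  }
  return events;
}
```

For the notification-only setup described above, the `data` would just be a hint such as "comments changed", and the client would then call its usual fetch endpoint.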
Yeah I agree. But even though SSE is super easy to grok and to implement (literally just standardized long polling), lots of existing infra builds on the assumption that connections are short-lived, so many of the WS issues apply to SSE as well.
IMHO, this unfortunate assumption is not really defensible in $current_year - especially from the multi-billion-dollar Cloud industry. I'd much prefer first-class support for long-lived connections on an infrastructure level, as opposed to a "proprietary database-level". I don't buy the argument that it's infeasible to solve the thundering herd issues.
I remember when I first heard about websockets, I was wondering what exactly it was useful for that SSE didn't already do. Almost all of the demos at the time were easier (IMO) to do with SSE. The two standards also both came from WHATWG at about the same time.
[edit]
I looked it up and SSE was a much earlier standard, but implementations of WS and SSE were relatively contemporary, with the exception of Opera (had SSE in 2006) and IE (never got SSE support).
I didn't know that SSE came first, thanks for adding that context.
It does feel like websockets tried to cram in several novel features, whereas SSE was simply giving proper clothing to the existing art of long polling.
In particular, WS is binary encoded, supports multiplexing/message splitting, and has several optional HTTP headers, which in hindsight appear to have simply complicated the spec for little to no value.
SSE has a lot of restrictions that make it unattractive, like a limit of 6 open connections per browser (per domain, over HTTP/1.1), shared across all tabs. This can cause confusing behavior for power users...
Hm yeah now that you mention it I recall that as well. Isn't that just an arbitrary crippling though? I can't imagine a good reason for why SSE would be hogging more resources than websockets.
Is it really worth the extra effort for WS over long polling at that point though? Especially if you're re-using the TCP connection it seems like the overhead would be minimal and the latency only slightly increased.
Sorry for the misunderstanding, but I don't mean WS over long polling. I mean WS in addition to polling, not long polling. Use websockets, but also expose an API to get the same events by specifying a timestamp. This way the websocket server implementation can be much simpler, and the client just has to call the API to "catch up" on missed events on reconnect.
You can also use this API for integrations, and your clients/consumers will thank you. For example, our third party integrations use the event log to sync back to their own data stores. They probably call this every hour, or once a day. You wouldn't want to use websockets with PHP apps like WordPress.
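A minimal sketch of that "WS plus catch-up API" idea, under some assumptions: `EventLog`, `LiveClient` and `LogEvent` are hypothetical names, the log is an in-memory stand-in for whatever backs /events?fromTS=..., the fetch is synchronous to keep the example short, and timestamps are assumed to be strictly increasing (with real clocks a sequence number would probably be safer).

```typescript
interface LogEvent {
  ts: number; // assumed strictly increasing per event
  type: string;
  payload: unknown;
}

// Server side: append-only event log that can be queried by timestamp.
class EventLog {
  private events: LogEvent[] = [];

  append(e: LogEvent): void {
    this.events.push(e);
  }

  // Roughly what a handler for /events?fromTS=<ts> could return.
  since(fromTS: number): LogEvent[] {
    return this.events.filter((e) => e.ts > fromTS);
  }
}

// Client side: applies live pushes and catches up from the log on reconnect.
class LiveClient {
  private lastSeenTs = 0;
  private fetchSince: (fromTS: number) => LogEvent[];
  readonly applied: LogEvent[] = [];

  constructor(fetchSince: (fromTS: number) => LogEvent[]) {
    this.fetchSince = fetchSince;
  }

  // Call this for every event pushed over the websocket.
  onPush(e: LogEvent): void {
    this.apply(e);
  }

  // Call this after the websocket reconnects.
  catchUp(): void {
    for (const e of this.fetchSince(this.lastSeenTs)) this.apply(e);
  }

  private apply(e: LogEvent): void {
    if (e.ts <= this.lastSeenTs) return; // already seen, skip duplicates
    this.applied.push(e);
    this.lastSeenTs = e.ts;
  }
}
```

Because the client only remembers the last timestamp it applied, the websocket server does not need to track per-client delivery state, and the same `since` query can serve integrations that sync hourly or daily.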
I can totally relate to that. When designing the RxDB GraphQL replication [1] protocol, it made things so much easier when the main data runs via normal request-response HTTP. Only the long-polling is switched out for WebSockets so that the client can know when data on the server has changed. This makes it really easy to implement the server-side components when you have a non-streaming database.
You're right, which is where I ended the conversation.
Ideally, you should be able to poll because it is resilient. The challenge however is when you separate the initial poll/pull from the update stream because now you have to maintain two code paths. What I'm proposing is that the poll and update stream use the same data format using patching.
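One way to read that proposal, sketched under my own assumptions: the initial poll returns a patch against an empty state, and every streamed update is a patch in the same shape, so the client has a single `applyPatch` code path. The shallow, null-means-delete format below is loosely modeled on JSON Merge Patch and is only an illustration, not necessarily what the poster has in mind.

```typescript
type State = Record<string, unknown>;
// A patch sets keys to new values; a null value deletes the key.
type Patch = Record<string, unknown>;

// Returns a new state with the patch applied (shallow, non-mutating).
function applyPatch(state: State, patch: Patch): State {
  const next: State = { ...state };
  for (const key of Object.keys(patch)) {
    const value = patch[key];
    if (value === null) delete next[key];
    else next[key] = value;
  }
  return next;
}

// Initial poll: the full snapshot is just a patch against {}.
let state: State = applyPatch({}, { title: "Hello", votes: 1 });
// Streamed update: same format, same code path.
state = applyPatch(state, { votes: 2, draft: null });
```

With this shape, falling back to polling only changes where the patches come from, not how they are applied.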
Why a separate poll instead of adding the initial offset to the websocket request URL or handshake? Just to be compatible with websocket-hostile networks?