It took a decent amount of digging, but it seems like this is trying to be a browser-compatible pubsub system more than a direct websocket replacement. The main thing they're doing is introducing a "mercure hub" between event producers and consumers (browsers) that handles delivery through SSE in a way that tracks subscribers and handles sporadic or interrupted connections.
Most of their clientside examples look pretty simple but they don't seem like they're fully implementing the logic described in "Reconnection, State Reconciliation and Event Sourcing" which seems rather complex. Maybe I'm missing something but it seems like that logic is the entire reason to use this over SSE alone.
Indeed, that's how it works. One of the key points of the solution is that you don't need anything client-side. The native EventSource class is all you need (but you can use more advanced SSE client libraries if you want).
Reconnection and state reconciliation are achieved automatically. The hub implements all the needed features: it stores sent events and automatically resends them at reconnection time if they have been lost. It's possible to do this transparently because browsers automatically send the ID of the last received message in a Last-Event-ID header when reconnecting. This feature is just often not implemented by SSE servers (because it’s not trivial to do).
It's also possible to ask for the events received since a specific ID, matching one or several topics, just by passing the query parameters defined in the protocol section.
By the way, we are working on a new website that will make these things clearer.
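To make the replay idea above concrete, here is a minimal sketch, assuming a hypothetical in-memory `EventStore` (the class, method names, and the fallback policy for unknown IDs are illustrative, not taken from the Mercure reference implementation). The hub keeps an ordered log and, on reconnection, returns everything after the ID the browser reported in `Last-Event-ID`.

```python
from dataclasses import dataclass, field


@dataclass
class EventStore:
    """Hypothetical in-memory event log; a real hub would persist and bound it."""

    events: list = field(default_factory=list)  # ordered (event_id, data) pairs

    def append(self, event_id: str, data: str) -> None:
        self.events.append((event_id, data))

    def events_after(self, last_event_id):
        """Return the events a client missed, given its Last-Event-ID value."""
        if last_event_id is None:
            return []  # fresh connection: nothing to replay
        for index, (event_id, _) in enumerate(self.events):
            if event_id == last_event_id:
                return self.events[index + 1:]
        # Unknown ID (assumed policy): replay everything still in the log.
        return list(self.events)


store = EventStore()
for n in range(1, 5):
    store.append(f"id-{n}", f"message {n}")

# The browser reconnects and reports the last event it saw.
missed = store.events_after("id-2")
```

Here `missed` holds the two events published after `id-2`, which the hub would write to the new connection before resuming the live stream.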
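As a rough illustration of those query parameters, the sketch below builds a subscription URL. The hub location and the helper name `subscribe_url` are assumptions, and the parameter names `topic` and `lastEventID` reflect my reading of the spec, so check them against the protocol section before relying on them.

```python
from urllib.parse import urlencode

HUB_URL = "https://example.com/.well-known/mercure"  # assumed hub location


def subscribe_url(topics, last_event_id=None):
    """Build a subscription URL with one `topic` parameter per topic."""
    params = [("topic", topic) for topic in topics]
    if last_event_id is not None:
        # Parameter name assumed from the spec; it asks for events after this ID.
        params.append(("lastEventID", last_event_id))
    return f"{HUB_URL}?{urlencode(params)}"


url = subscribe_url(
    ["https://example.com/books/1", "https://example.com/books/2"],
    last_event_id="id-2",
)
```

The resulting URL is what a browser would pass to `new EventSource(url)`; both topics share the one connection.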
Hi Kévin! A while ago, I built a hub implementation using TypeScript and Deno, mainly for learning, but also to see if I could come up with a neat solution for distributed in-memory event storage using Raft. It implements the full spec so far, and is pretty easy to understand, if I may say so myself.
Do you still accept entries for alternative hubs? It may be helpful for others to understand how the specification is supposed to work; some parts were a little opaque to me at first, and required digging into the reference implementation to fully grasp.
And it doesn't appear you have an associated working group (WG) for your (expired) draft publication, which could help you identify whether you are reinventing an existing wheel.
WebSub cannot be used by browsers because browsers cannot receive POST requests. WebSub is for server to server communications.
Mercure is basically WebSub over SSE.
The draft has been discussed several times on the HTTP WG.
>The main thing they're doing is introducing a "mercure hub" between event producers and consumers (browsers) that handles delivery through SSE in a way that tracks subscribers and handles sporadic or interrupted connections.
This is the single biggest issue with using SSE, so I'm intrigued.
WebSockets are hard to secure (they totally bypass CORS as well as other browser built-in protections), don't work (yet) with HTTP/3, and for most use cases require you to implement many features yourself: reconnection in case of network failure, refetch of lost messages, authorization, topic mechanism…
Mercure, which is built on top of SSE (it's more an extension to SSE than an alternative to it), fixes these issues.
However, SSE (as well as WebSockets) can be hard to use with stacks not designed to handle persistent connections, such as PHP, serverless, and most web servers proxying Ruby, Python, etc. apps. Even for languages designed to handle persistent connections, it's often more efficient and easier to manage persistent connections with ad hoc software running on dedicated hardware.
That's what Mercure allows. Mercure provides a "hub", a server that will maintain the persistent connections with the browsers, store events, and re-send them in case of network issues (or if asked for old messages by a client). To broadcast a message to all connected users, the server app (or even a client) can just send a single POST request to the hub. The hub will also check that clients are authorized to subscribe or publish to a given topic (that's why JWT is used).
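A minimal sketch of that single POST, under stated assumptions: the hub URL, the placeholder token, and the helper name `build_publish_request` are hypothetical, and the form fields `topic` and `data` follow my reading of the spec. The request is only prepared here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

HUB_URL = "https://example.com/.well-known/mercure"  # assumed hub location
PUBLISHER_JWT = "<publisher-jwt>"  # placeholder; a real token is signed with the hub's key


def build_publish_request(topic: str, data: str) -> Request:
    """Prepare (but do not send) the form-encoded POST that publishes an update."""
    body = urlencode({"topic": topic, "data": data}).encode("utf-8")
    return Request(
        HUB_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {PUBLISHER_JWT}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


request = build_publish_request("https://example.com/books/1", '{"status": "OutOfStock"}')
# urllib.request.urlopen(request) would hand the update to the hub.
```

The server app never holds a browser connection itself; it fires this request and the hub fans the update out to every subscriber of the topic.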
The reference implementation is written in Go, as a module for the Caddy web server, and is very efficient/optimized (it can handle thousands of persistent connections on very small machines).
Install a Mercure hub and you have all these features available without having to write any code. Client-side, no SDK is required, you can embrace the built-in EventSource JavaScript class.
You can reimplement the Same Origin Policy serverside by checking that the Origin header equals the Host header. Even more secure would be to check both against an allowlist (this protects against DNS rebinding, which the Same Origin Policy doesn't protect against).
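For what it's worth, the check described above might look roughly like this. It is a sketch only: `ALLOWED_HOSTS` and `origin_allowed` are hypothetical names, and the headers are assumed to arrive as a plain dict with canonical capitalization.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"app.example.com"}  # hypothetical allowlist


def origin_allowed(headers: dict) -> bool:
    """Accept a WebSocket handshake only if Origin matches Host and the allowlist."""
    origin = headers.get("Origin")
    host = headers.get("Host")
    if not origin or not host:
        return False
    # "https://app.example.com" -> "app.example.com" (includes any explicit port)
    origin_host = urlsplit(origin).netloc
    return origin_host == host and host in ALLOWED_HOSTS


same_site = origin_allowed({"Origin": "https://app.example.com", "Host": "app.example.com"})
cross_site = origin_allowed({"Origin": "https://evil.example", "Host": "app.example.com"})
rebinding = origin_allowed({"Origin": "http://attacker.test", "Host": "attacker.test"})
```

The third call is the DNS rebinding case: Origin and Host agree with each other, so only the allowlist rejects it.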
>as well as other browser built-in protections
I'm curious what those are.
>for most use cases require to implement many features by yourself: [...] authorization
Isn't auth of websockets generally the same as auth of any JavaScript-initiated HTTP request (e.g. fetch())? Check that the cookie looks good? Now, in the case of OAuth tokens, websockets are more difficult than fetch(), because you cannot attach an Authorization: Bearer header to a websocket. But OAuth is less common than cookies for websites.
Once the connection is upgraded, you lose all metadata included in the HTTP headers (because it’s not HTTP) and all protections relying on it.
Also CORS and SOP can be bypassed: https://dev.to/pssingh21/websockets-bypassing-sop-cors-5ajm
Of course you can reimplement everything by hand (and you must if you use WebSockets), but with SSE/Mercure you don't have to because it's plain old HTTP.
> Once the connection is upgraded, you lose all metadata included in the HTTP headers (because it’s not HTTP) and all protections relying on it.
The Upgrade request is HTTP and you can extract all needed metadata from there and store it server side as needed. Those metadata wouldn't change during an active WebSocket session anyway, would they?
The auth headers (Authorization, Cookie) are all passed along, and that's what I want to establish a secure connection from the browser.
For more customized wishes there's always this "ticket"-based flow[0][1] that shouldn't be hard to implement. I might be a bit naive, but what needed metadata and custom headers are we talking about?
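One common shape of such a ticket flow, sketched under assumptions (the function names, the 30-second lifetime, and the in-memory dict are all illustrative, not taken from the linked references): an authenticated HTTP endpoint hands out a short-lived, single-use ticket, and the WebSocket handshake redeems it.

```python
import secrets
import time

TICKET_TTL_SECONDS = 30  # assumed lifetime
_tickets = {}  # ticket -> (user_id, expiry); in-memory for illustration only


def issue_ticket(user_id: str, now=None) -> str:
    """Called from an authenticated HTTP endpoint before the socket is opened."""
    now = time.time() if now is None else now
    ticket = secrets.token_urlsafe(32)
    _tickets[ticket] = (user_id, now + TICKET_TTL_SECONDS)
    return ticket


def redeem_ticket(ticket: str, now=None):
    """Called during the WebSocket handshake; each ticket works at most once."""
    now = time.time() if now is None else now
    user_id, expires_at = _tickets.pop(ticket, (None, 0.0))
    if user_id is None or now > expires_at:
        return None
    return user_id


ticket = issue_ticket("alice", now=100.0)
first_use = redeem_ticket(ticket, now=110.0)
second_use = redeem_ticket(ticket, now=110.0)
```

The client would typically pass the ticket in the WebSocket URL's query string; because it is popped on first use, a leaked URL cannot be replayed.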
Like I said, you can reimplement SOP by checking that the Origin header equals the Host header. Or alternatively check them against an allowlist. It's just a couple lines of code, and then it's as secure as normal HTTP.
>The initial handshake occurs via HTTP Upgrade request, the response body is disregarded and the HTTP/HTTPS protocol is upgraded to WS/WSS protocol.
But the response body isn't disregarded. What could best be described as the response body is the server->client websocket messages. Those aren't disregarded.
Thanks, that's an interesting vulnerability. I'm not so sure it's websocket-specific though. It looks like the attack could be done without websockets at all. The end request that does the malicious action is plain HTTP. The initial request that triggers the desync uses a Connection: Upgrade header, which is used in websockets, but isn't actually websocket-specific. Connection: Upgrade was created in RFC 2616 from 1999, which was before the websocket RFC 6455 in 2011.
The vulnerability was that there was a proxy, and the proxy didn't properly handle the Connection header as specified in RFC 2616. RFC 2616 says that a proxy must remove the Connection header if it's not handling it itself, and the proxy in this case wasn't handling it itself (not checking for a 101 response).
I would say this vulnerability is an HTTP request smuggling vulnerability. That's a vulnerability where there's a proxy that implements some type of security, and the proxy can be tricked into sending a request to the backend that the proxy didn't actually parse. HTTP request smuggling vulnerabilities impact regular HTTP, not just websockets. This has nothing to do with the Same Origin Policy.
> WebSockets are hard to secure (they totally bypass CORS as well as other browser built-in protections), don't work (yet) with HTTP/3, and for most use cases require you to implement many features yourself: reconnection in case of network failure, refetch of lost messages, authorization, topic mechanism…
Having written WebSocket CORS with Authentication (Cognito) I know this isn't quite true. The initial connection is a standard HTTP request that returns a 100 series, with preflights and everything. That initial request has all the headers you might want to send to a server for auth. It's a bit odd IMO but the auth string is sent in the web socket constructor, second argument, really easy to miss. Happy to provide both server and client code examples.
You might consider updating the readme to make it clear what Mercure/SSE does that WebSockets doesn't.
If it was me, I'd also put in the readme what WebSockets does that Mercure/SSE does not, such as bidirectional communication, low latency client-to-server messages, binary data transfer, etc.
Exactly this, the landing page seems much more about the reference implementation than the actual protocol. I will say the docs at https://mercure.rocks/spec are nice.
I have been using Server Sent Events with PHP since 2017 and it works very well. All you need is the right headers and a loop that breaks on user disconnect. Honestly it was super easy and had no issues setting it up and getting it to work.
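The "right headers and a loop" recipe is language-agnostic; here is a rough Python rendering of it rather than the poster's PHP. `SSE_HEADERS`, `format_event`, `stream`, and the `is_connected` callback are hypothetical names; a real server would get the disconnect signal from its framework.

```python
SSE_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
}


def format_event(data: str, event_id: str) -> str:
    """Serialize one event; multi-line payloads need one `data:` field per line."""
    lines = [f"id: {event_id}"]
    lines += [f"data: {line}" for line in data.split("\n")]
    return "\n".join(lines) + "\n\n"  # a blank line terminates the event


def stream(messages, is_connected):
    """Yield frames until the source is exhausted or the client has gone away."""
    for number, message in enumerate(messages, start=1):
        if not is_connected():
            break
        yield format_event(message, str(number))


frames = list(stream(["hello", "two\nlines"], is_connected=lambda: True))
```

Note that this covers only the streaming half. Sending `id:` fields is what lets the browser report `Last-Event-ID` later, but replaying missed events still needs a store on the server side.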
It might be worth mentioning to the parent poster that SSEs (and Mercure) are unidirectional only (server -> client), whereas WebSockets are bidirectional.
Nowadays that's hardly an issue because SSE is multiplexed over HTTP/2 and HTTP/3 alongside regular requests. In some ways, SSE is today more efficient than websockets too because if you use it with HTTP/3 then you're avoiding head of line blocking which is still an issue with websockets.
(Also in practice many places websockets are not even supporting HTTP/2 because on the server they were implemented on top of raw sockets which is no longer possible with HTTP/2)
Head-of-line still exists in HTTP/3, but on another level. A stream can still get blocked. With websockets the whole connection gets blocked (HTTP/1), but that connection isn't shared with anything else, right?
Worth noting that HTTP/3 seems to recover faster than TCP-based protocols.
They seem to handle a lot of nontrivial issues I have to deal with frequently like synchronization, but weirdly enough they do it with JWTs. For me the entire point of SSEs is that I can avoid using JWTs and use standard session logic which is very easy to reason about.
By the way, I'm sure JWTs are fine, and I'm not trying to step on any toes. I'm just not an expert with them and I know there are footguns, so with security stuff I stick to the most boring technology I have access to.
I have certainly seen people reach for websockets to just reverse the direction of data flow from server to client. On HTTP/1.1, the main issue people can hit is the max connection limit.
This connection limit does not apply when using SSE over HTTP/2+, because server connections are multiplexed onto a single connection.
I also prefer starting with SSE where possible, as it is simple conceptually, easy to implement (even from scratch), and doesn't introduce a secondary client to server path. Having that ability (even if unused) tends to create a temptation to use it, and when that happens it introduces a choice of whether to use normal HTTP requests or make an equivalent request over websocket.
Using websocket for things HTTP can handle is a mistake, in my opinion, because HTTP is simpler than websocket to fetch state.
Of course, if you need websocket, you need it, but there are some that just want to play with it in production or add it to their resume. I suspect that dynamic is what causes the 'cargo culting' you talk about.
Going off my personal experience I don't see SSE being mentioned anywhere at all. But I always come across some new shiny web framework/wsgi listing websockets as a feature.
The reality is a lot of applications simply do not need bi-directional client-server communication but for some reason this abuse has been adopted as standard.
> The reality is a lot of applications simply do not need bi-directional client-server communication but for some reason this abuse has been adopted as standard.
I suspect the treatment as 'standard' is what thread OP is talking about. See my sibling comment in the thread. Unless you absolutely need websocket, SSE is simpler, and due to that, better.
I suspect the reason many select websocket over SSE is due to its limitations (connection limit under HTTP/1.1, no client->server sends), but SSE for server->client events and normal HTTP requests for state fetching is better architecturally for bog standard apps in my experience.
It is targeted at the PHP community, not the language; it has no dependency on PHP. It is written in Go and based on Caddy. (I contributed one PR at my former job.)
It's really not. Not properly anyway. You'd spend a week, probably, just getting all the synchronization stuff right. Unless of course, you don't care about losing messages during a disconnection. Or you could just install this in a couple of hours and be done with it; not have to worry about security, maintaining NIH libraries, etc.
It is very easy. I have used it on various projects including chats, live dynamic dashboards, prompters and games. The SSE setup took less than a day, and I never had issues with sync or lost messages during disconnection.
I don’t know how you’ve been using SSE, but if you’ve had all these issues you must have been doing something very wrong.
I can also recommend https://github.com/slact/nchan. It has the same idea: hide and abstract pubsub complexity for a backend service. nchan is built on top of nginx and could be more convenient to deploy (existing nginx configuration knowledge).
That is if the connection isn't downgraded by some other mechanism. Lots of corporate clients downgrade connections to HTTP/1.1 due to how their network is set up.
There is also an issue with SSE that I've noticed websockets don't seem to suffer from as much: being blocked by corporate firewalls.
Most websites use TLS, HSTS and the like nowadays. It’s hard for a corporate proxy to downgrade the connection.
Also, the limitation is only applied by browsers, not by (reverse) proxies and other non-browser clients.
That being said, Mercure allows subscribing to many "topics" using a single HTTP connection. This helps mitigate the limitation with HTTP/1 (which is very uncommon these days).
Reminds me of WAMP (Web Application Messaging Protocol) [0], which is a WebSocket subprotocol.
I find the title odd, because, why would you want to replace server-sent events with WebSocket, if the great thing about SSE is the simplicity, both client- and server-side?
We've been using Mercure for 3+ years in production. It's been pretty much set it and forget it. Maybe once a year do we have to maintain the server e.g. clear up space, but overall has been great for our use case (which does not need full pub/sub).
Yes, as it's "just" SSE, this works pretty well behind Cloudflare and other similar solutions. Mercure also automatically sends the NGINX-specific headers required to allow unbuffered connections.
Native mobile apps are also entirely supported. It's a very common use case. Most mobile languages have SSE client libraries.
The biggest issue with SSE is that there is unfortunately a lot of deployed software that will downgrade incoming connections to HTTP/1.1, so even if the client would otherwise have supported HTTP/2 or higher, it's not given the chance to do so.
Was there a specific reason to use AGPL-3.0? Not criticizing, just asking.
Tried to read about the license and was greeted by a tl;dr summary of the AGPL-3.0 license [1]. I am no lawyer but my gut tells me that providing such a summary is an invitation to strange disputes. Take care.
Also not a lawyer, but I have discussed this with multiple lawyers. As long as you make very clear that you're summarizing the legal document, and that the legal document you're providing is the canonical truth, you're allowed to provide that kind of summary.
For example, Creative Commons has a visual/bullet point explanation of their licenses. That's entirely okay, as the legal text is the core license.
I had a similar discussion with a lawyer once about a TOS that also included a summary. The lawyer told me that as long as you make it clear that your summary is just a summary, and is not the agreement, and you point out the actual agreement, you're okay.
In this case, the OP is pointing to the legal text clearly and merely summarizing its most salient points.
However, what confuses me is that they seem to focus mostly on the protocol, hub, and publish implementation. At least based on a quick glance at https://mercure.rocks/docs/ecosystem/awesome and elsewhere, it seems they expect you to use raw SSE clientside according to their protocol described in e.g. https://mercure.rocks/spec#active-subscriptions.