This feels ill-advised, and I don't believe HTTP streaming was designed with this pattern in mind.
Perhaps I'm wrong, but I believe HTTP streaming is for chunking large blobs. I worry that if you use this pattern and treat streaming like a pub/sub mechanism, you'll regret it. HTTP intermediaries don't expect this traffic pattern (e.g., NGINX, CloudFlare, etc.). And I suspect every time your WiFi connection drops while the stream is open, the fetch API will raise an error as if the request failed.
However, I agree you probably don't need WebSockets for many of the ways they're used—server-sent events are a simpler solution for many situations where people reach for WebSockets... It's a shame SSEs never received the same fanfare.
> I don't believe that HTTP streaming was designed with this pattern in mind
> server-sent events are a simpler solution
FWIW, Server-Sent Events are a protocol on top of HTTP streaming.
In fact I'm somewhat surprised that the article doesn't mention it, instead rolling its own SSE alternative that looks (to my non-expert eyes) like a lower-level version of the same thing. It seems a bit weird to me to use chunks as a message boundary; I'd worry that has weird edge cases (e.g., won't large responses be split into multiple chunks?)
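To illustrate that edge case: a transport chunk is not a message, so any consumer has to buffer until it sees an explicit delimiter. A minimal sketch in Python, using a blank-line delimiter in the style of the SSE wire format (the event contents here are made up):

```python
# Sketch: chunk boundaries can't be trusted as message boundaries.
# A parser must buffer bytes and only emit complete events, because one
# network read may contain half an event or several events at once.

def parse_events(chunks):
    """Yield complete events from an iterable of byte chunks."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        # a blank line ("\n\n") terminates an event, as in SSE
        while b"\n\n" in buf:
            raw, buf = buf.split(b"\n\n", 1)
            yield raw.decode()

# The same two events come out regardless of how the transport split them:
split_a = [b"data: hello\n\ndata: world\n\n"]            # one big chunk
split_b = [b"data: hel", b"lo\n\ndata: wo", b"rld\n\n"]  # arbitrary splits

assert list(parse_events(split_a)) == list(parse_events(split_b)) == [
    "data: hello", "data: world"]
```

If you use the chunk itself as the boundary, a proxy or TLS layer re-chunking the stream silently corrupts your framing; an in-band delimiter (or length prefix) survives any split.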
I pretty much always prefer SSE over WebSockets just because of the end-to-end simplicity. It's "just HTTP", so all the HTTP-based tech and tools apply out-of-the-box, without the special configuration that WS requires. curl (or even netcat) "just works"; no special client. I don't have to do any special CDN configuration to proxy connections or terminate SSL, aside from turning off buffering.
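For a concrete sense of how small that configuration is: when terminating SSE behind nginx, "turning off buffering" usually amounts to a few directives in the location block (the path and upstream name here are hypothetical):

```nginx
location /events {
    proxy_pass http://app_backend;   # hypothetical upstream name
    proxy_http_version 1.1;          # needed for streamed/chunked responses
    proxy_set_header Connection "";  # don't forward a "close" to the backend
    proxy_buffering off;             # flush each event to the client immediately
    proxy_read_timeout 1h;           # let the stream stay open
}
```

Compare that with WS, where the proxy also has to handle the Upgrade handshake and switch protocols mid-connection.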
WebSockets require almost a completely new L7 stack and tons of special configuration to handle the Upgrade handshake, text and binary frames, etc. And once you're out of "HTTP mode" you have to implement the primitive mechanics of basically everything yourself: auth, redirects, sessions, etc.
It's why I originally made Tiny SSE which is a purpose-built SSE server written in Rust and programmable with Lua.
You could do a lobotomized WebSockets implementation that was an extremely thin layer on top of http, similarly to this.
In this way SSE and WebSockets are exactly the same: they are HTTP requests that you keep open. To firewalls and other network equipment both look the same. They look like long-lived HTTP requests, because that is what they are.
If you only care about events in one direction, it's a perfectly fine solution, but if you need something other than that, things might get awkward using SSE and regular HTTP calls, even with long-lived HTTP connections.
> once you're out of "HTTP mode" you now have to implement the primitive mechanics of basically everything yourself, like auth, redirects, sessions, etc.
WebSockets do support authentication via cookies or custom headers, don't they?
>If you only care about events in one direction, it's a perfectly fine solution
I feel like clients sending requests to servers is a pretty well-solved problem with regular HTTP? I can't imagine how that could be the difficult part of the equation.
Not if you need bidirectional communication, for example a ping-pong of request/response. That is solved with WS, but hard to do with SSE + requests. The client's requests may not even hit the same SSE server, depending on your setup. There are workarounds, obviously, but it complicates things.
> WebSockets do support authentication via cookies or custom headers, don't they?
It will depend on how the websocket architecture is implemented. A lot of systems will terminate the HTTP connection at the CDN or API gateway and just forward the upgraded TCP socket to the backend without any of the HTTP semantics intact.
Sure. If you need HTTP header / cookie-based auth with WebSockets, then you need the full HTTP request with all the headers intact. This is the common case, or at least something that is pretty straightforward to architect for.
Authenticating a websocket is just as easy as authenticating a regular http request. Because it is exactly the same.
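To make "it is exactly the same" concrete: the WebSocket handshake is an ordinary HTTP GET carrying normal headers (including Cookie), and the server's side of the upgrade is a fixed transform defined by RFC 6455. A sketch in Python; the request text below is the example handshake from the RFC with an illustrative cookie added:

```python
# The handshake request is plain HTTP, so a server can authenticate the
# Cookie (or Authorization) header before agreeing to upgrade.
# Sec-WebSocket-Accept is defined by RFC 6455 as base64(SHA-1(key + GUID)).
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed constant from RFC 6455

def accept_value(sec_websocket_key: str) -> str:
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

request = (
    "GET /chat HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==\r\n"
    "Cookie: session=abc123\r\n\r\n"      # normal cookie auth, nothing special
)
headers = dict(
    line.split(": ", 1)
    for line in request.split("\r\n")[1:]
    if ": " in line
)
# The session cookie is right there for any standard auth middleware:
assert headers["Cookie"] == "session=abc123"
print(accept_value(headers["Sec-WebSocket-Key"]))  # RFC 6455's worked example
```

Everything after this exchange is WS framing rather than HTTP, which is where the earlier point about terminating proxies dropping the HTTP semantics comes in.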
Interesting, do you have any examples for that? I haven't used WebSockets in such a context yet but was always curious how it would be exposed to the application servers.
Because of TCP, large chunks are always split into smaller segments; it's just that at the HTTP level we don't know and don't see it. UDP forces people into designing their own protocols if the data is a defined package of bytes. Having done some socket coding, my impression is WebSockets would be good for a high-end browser-based game, browser-based simulations, or maybe a high-end trading system. At that point the browser is just a shell/window. As others have pointed out, there are already plenty of alternatives for web applications.
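The "design your own protocol" part usually comes down to restoring message boundaries on top of a byte stream. A common convention (not anything specific to WS or SSE) is a length prefix; a minimal sketch:

```python
# Sketch: length-prefixed framing over a byte stream. Each message is
# preceded by a 4-byte big-endian length, so the receiver can reassemble
# messages no matter how TCP splits the bytes.
import struct

def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length."""
    return struct.pack("!I", len(payload)) + payload

def unframe(buf: bytes):
    """Return (complete payloads, leftover partial bytes)."""
    msgs = []
    while len(buf) >= 4:
        (n,) = struct.unpack("!I", buf[:4])
        if len(buf) < 4 + n:
            break  # message incomplete: wait for more bytes
        msgs.append(buf[4:4 + n])
        buf = buf[4 + n:]
    return msgs, buf

stream = frame(b"ping") + frame(b"pong")
msgs, rest = unframe(stream[:10])          # deliberately cut mid-message
assert msgs == [b"ping"]                   # only the complete message is delivered
msgs2, rest2 = unframe(rest + stream[10:])  # leftover + next read
assert msgs2 == [b"pong"] and rest2 == b""
```

WebSocket frames and SSE's blank-line delimiter are both solutions to this same problem, just standardized.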
The problem for things like video games and trading is that WebSockets run over TCP. Technologies like WebRTC, which can run over UDP, allow for much faster updates.
I think WebSockets certainly have their uses. Mostly in systems where SSE isn't available quickly and easily, or when sending a bunch of quick communications one after another, since there's no way to know whether the browser will pipeline the requests automatically or open a whole bunch of separate connections.
With the current AI/LLM wave, SSE has received a lot of attention again, and most LLM chat frontends use it. At least from my perception, as a result of this, support for SSE in major HTTP server frameworks has improved a lot in the last few years.
It is a bit of a shame though, that in order to do most useful things with SSEs you have to resort to doing non-spec-compliant things (e.g. send initial payload with POST).
Arguably it’s also because of serverless architecture where SSE can be used more easily than WS or streaming. If you want any of that on Lambda and API Gateway, for example, and didn’t anticipate it right off the bat, you’re in for quite a bit of pain.
The issue I have with SSE, and with what's being proposed in this article (which is very similar), is the very long-lived connection.
OpenAI uses SSE for callbacks. That works fine for chat and other "medium"-duration interactions, but when it comes to fine-tuning (which can take a very long time), SSE always breaks and requires client-side retries to get it to work.
So, why not instead use something like long polling + HTTP streaming (a slight tweak on SSE)? Here is the idea:
1) Make a standard GET call /api/v1/events (using standard auth, etc)
2) If anything is in the buffer / queue return it immediately
3) Stream any new events for up to 60s. Each event has a sequence id (similar to the article). Include keep alive messages at 10s intervals if there are no messages.
4) After 60s close the connection - gracefully ending the interaction on the client
5) Client makes another GET request using the last received sequence
What I like about this is that it's very simple to understand (like SSE; it basically is SSE), has low latency, is just a standard GET with standard auth, and works regardless of how load balancers, etc., are configured. Of course, there will be errors from time to time, but dealing with timeouts / errors will not be the norm.
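The client side of steps 1–5 can be sketched in a few lines. Everything here is an illustrative assumption, not a real API: the `seq` field, the `keepalive` event type, and the newline-delimited JSON encoding are placeholders for whatever the server actually sends.

```python
# Sketch: processing one 60s connection's worth of streamed events.
# Assumed wire format: one JSON object per line; "keepalive" heartbeats
# carry no sequence id and are dropped.
import json

def consume(lines, last_seq):
    """Return (delivered events, sequence id to resume from)."""
    events = []
    for line in lines:
        event = json.loads(line)
        if event.get("type") == "keepalive":
            continue  # 10s heartbeat, nothing to deliver
        events.append(event)
        last_seq = event["seq"]
    return events, last_seq

# One server window might look like this on the wire:
window = [
    '{"seq": 41, "type": "message", "data": "hi"}',
    '{"type": "keepalive"}',
    '{"seq": 42, "type": "message", "data": "bye"}',
]
events, last_seq = consume(window, last_seq=40)
assert [e["seq"] for e in events] == [41, 42] and last_seq == 42
# After the server closes at ~60s, the client simply issues the next
# standard GET (e.g. /api/v1/events?after=42) with its normal auth.
```

Because each iteration is a plain, short-lived GET, a dropped connection just means the next request resumes from `last_seq`; there's no special reconnect machinery.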
My issue with EventSource is that it doesn't use standard auth. Including the JWT in a query string is an odd one out, requiring alternate middleware, and feels like there's a high chance of leaking the token in logs, etc.
I'm curious though, what is your solution to this?
Secondly, not every client is a browser (my OpenAI / fine tune example is non-browser based).
Finally, I just don't like the idea of things failing all the time with something working behind the scenes to resolve the issues. I'd like errors / warnings in logs to mean something, personally.
>> I don't understand the advantages of recreating SSE yourself like this vs just using SSE
This is more of a strawman; I don't plan to implement it. It's based on experiences consuming SSE endpoints as well as creating them.
> I'm curious though, what is your solution to this?
Cookies work fine, and are the usual way auth is handled in browsers.
> Secondly, not every client is a browser (my OpenAI / fine tune example is non-browser based).
That's fair. It still seems easier, to me, to save any browser-based clients some work (and avoid writing your own spec) by using existing technologies. In fact, what you described isn't even incompatible with SSE: just have the server close an otherwise normal SSE connection every 60 seconds, and all of your points are covered except the auth one. (I've never actually seen bearer tokens used in a browser context, to be fair; you'd have to allow cookies like every other web app.)
GP is talking about intermediary proxies, CDNs etc. that might be unhappy about long-running connections with responses trickling in bit by bit, not doubting that it works on the client side.
That said, I'd be surprised if proxy software or services like Cloudflare didn't have logic to automatically opt out of "CDN mode" and switch to something more transparent when they see "text/event-stream". It's not that uncommon, all things considered.