You might not need WebSockets (hntrl.io)
405 points by hntrl 2 days ago | 263 comments

This feels ill-advised, and I don't believe that HTTP streaming was designed with this pattern in mind.

Perhaps I'm wrong, but I believe HTTP streaming is for chunking large blobs. I worry that if you use this pattern and treat streaming like a pub/sub mechanism, you'll regret it. HTTP intermediaries don't expect this traffic pattern (e.g., NGINX, CloudFlare, etc.). And I suspect every time your WiFi connection drops while the stream is open, the fetch API will raise an error as if the request failed.

However, I agree you probably don't need WebSockets for many of the ways they're used—server-sent events are a simpler solution for many situations where people reach for WebSockets... It's a shame SSEs never received the same fanfare.


> I don't believe that HTTP streaming was designed with this pattern in mind

> server-sent events are a simpler solution

Fwiw Server-Sent Events are a protocol on top of HTTP Streaming.

In fact I'm somewhat surprised that the article doesn't mention it, instead rolling their own SSE alternative that looks (to my non-expert eyes) like a lower-level version of the same thing. It seems a bit weird to me to use chunks as a message boundary; I'd worry that has weird edge cases (e.g. won't large responses be split into multiple chunks?)


I pretty much always prefer SSE over websockets just because of the simplicity end-to-end. It's "just HTTP", so all the HTTP-based tech and tools apply out-of-the-box, without the special configuration that WS requires. Curl (or even netcat) "just works", no special client. I don't have to do any special CDN configuration to proxy connections or terminate SSL aside from just turning off buffering.

Websockets require almost a completely new L7 stack and tons of special configuration to handle Upgrade, text or binary frames, etc. And once you're out of "HTTP mode" you now have to implement the primitive mechanics of basically everything yourself, like auth, redirects, sessions, etc.

It's why I originally made Tiny SSE which is a purpose-built SSE server written in Rust and programmable with Lua.

https://tinysse.com

https://github.com/benwilber/tinysse


Everything 'just works', yet you needed to create your own server for it which needs scripting support?

IMO 'just works' means Apache supports it out of the box with a simple config file and you can just start sending messages to client IPs.


"just works" in the sense that this is a complete SSE client application:

    while true; do
        curl -sN example.com/sse | handle-messages.sh
    done


Because it's just text-over-http. This isn't possible with websockets without some kind of custom client and layer 7 protocol stack.
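
And the server side is barely more than that. A minimal Node sketch (untested, purely to illustrate; the tick payload is made up):

    import http from "node:http";

    http.createServer((req, res) => {
      res.writeHead(200, {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
      });
      let n = 0;
      const timer = setInterval(() => res.write(`data: tick ${n++}\n\n`), 1000);
      req.on("close", () => clearInterval(timer));
    }).listen(8080);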

You could do a lobotomized WebSockets implementation that was an extremely thin layer on top of http, similarly to this.

In this way SSE and WebSockets are exactly the same. They are HTTP requests that you keep open. To firewalls and other network equipment both look the same. They look like long lived http requests, because that is what they are.


It’s functional, but I wouldn’t say it’s complete without Last-Event-ID handling.

If you only care about events in one direction, it's a perfectly fine solution, but if you need something other than that, things might get awkward using SSE and regular HTTP calls, even with long-lived HTTP connections.

> once you're out of "HTTP mode" you now have to implement the primitive mechanics of basically everything yourself, like auth, redirects, sessions, etc.

WebSockets do support authentication via cookies or custom headers, don't they?


>If you only care about events in one direction, it's a perfectly fine solution

i feel like clients sending requests to servers is a pretty well-solved problem with regular http? i can't imagine how that could be the difficult part of the equation.


not if you need bidirectional communication, for example a ping-pong of request/response. That is solved with WS, but hard to do with SSE+requests. The client requests may not even hit the same SSE server depending on your setup. There are workarounds obviously, but it complicates things.

> WebSockets do support authentication via cookies or custom headers, don't they?

It will depend on how the websocket architecture is implemented. A lot of systems will terminate the HTTP connection at the CDN or API gateway and just forward the upgraded TCP socket to the backend without any of the HTTP semantics intact.


Sure. If you need http header / cookie based auth with websockets, then you need the full http request with all the headers intact. This is the common case, or at least something that is pretty straightforward to architect for.

Authenticating a websocket is just as easy as authenticating a regular http request. Because it is exactly the same.


Interesting, do you have any examples for that? I haven't used WebSockets in such a context yet but was always curious how it would be exposed to the application servers.

Because of TCP, large chunks are always split into smaller chunks. It’s just that at the HTTP level we don’t know and don’t see it. UDP forces people into designing their own protocols if the data is a defined package of bytes. Having done some socket coding my impression is web sockets would be good for a high end browser based game, browser based simulations, or maybe a high end trading system. At that point the browser is just a shell/window. As others have pointed out, there are already plenty of alternatives for web applications.

The problem for things like video games and trading is that websockets only support TCP by default. Technologies like WebRTC allow for much faster updates.

I think websockets certainly have their uses. Mostly in systems where SSE isn't available quickly and easily, or when sending a bunch of quick communications one after another as there's no way to know if the browser will pipeline the requests automatically or if it'll set up a whole bunch of requests.


My problem with SSE is that it has a very low connection limit of 6 per domain across the entire browser session.

You just use HTTP/2. It's a solved problem.

That's an HTTP 1.1 problem, not SSE. Websockets has the same restriction.

With the current AI/LLM wave, SSE has received a lot of attention again, and most LLM chat frontends use it. At least from my perception, as a result of this, support for SSE in major HTTP server frameworks has improved a lot in the last few years.

It is a bit of a shame though, that in order to do most useful things with SSEs you have to resort to doing non-spec-compliant things (e.g. send initial payload with POST).


Same with graphql subscriptions.

Arguably it’s also because of serverless architecture where SSE can be used more easily than WS or streaming. If you want any of that on Lambda and API Gateway, for example, and didn’t anticipate it right off the bat, you’re in for quite a bit of pain.


SSE limitations in the browser are still a drag for this, too.

Also MCP uses it

> Perhaps I'm wrong, but I believe HTTP streaming is for chunking large blobs.

You are wrong in the case of Chrome and Firefox. I have tried it, and streamed elements (e.g. an unordered list) are displayed instantly.

But for Safari, "text/html" streaming happens in 512 byte chunks[1].

[1] https://bugs.webkit.org/show_bug.cgi?id=265386


GP is talking about intermediary proxies, CDNs etc. that might be unhappy about long-running connections with responses trickling in bit by bit, not doubting that it works on the client side.

That said, I'd be surprised if proxy software or services like Cloudflare didn't have logic to automatically opt out of "CDN mode" and switch to something more transparent when they see "text/event-stream". It's not that uncommon, all things considered.


The issue I have with SSE, and with what is being proposed in this article (which is very similar), is the very long-lived connection.

OpenAI uses SSE for callbacks. That works fine for chat and other "medium" duration interactions but when it comes to fine tuning (which can take a very long time), SSE always breaks and requires client side retries to get it to work.

So, why not instead use something like long polling + http streaming (a slight tweak on SSE)? Here is the idea:

1) Make a standard GET call /api/v1/events (using standard auth, etc)

2) If anything is in the buffer / queue return it immediately

3) Stream any new events for up to 60s. Each event has a sequence id (similar to the article). Include keep alive messages at 10s intervals if there are no messages.

4) After 60s close the connection - gracefully ending the interaction on the client

5) Client makes another GET request using the last received sequence

What I like about this is it is very simple to understand (like SSE - it basically is SSE), has low latency, is just a standard GET with standard auth and works regardless of how load balancers, etc., are configured. Of course, there will be errors from time to time, but dealing with timeouts / errors will not be the norm.
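
For illustration, the whole client loop could look roughly like this (a sketch only; the endpoint, `token`, `handle`, and the newline-delimited JSON framing are all hypothetical, and error handling is omitted):

    let lastSeq = 0;
    while (true) {
      const res = await fetch(`/api/v1/events?after=${lastSeq}`, {
        headers: { Authorization: `Bearer ${token}` },
      });
      const reader = res.body.pipeThrough(new TextDecoderStream()).getReader();
      let buf = "";
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break; // server closed after ~60s; loop around and reconnect
        buf += value;
        let i;
        while ((i = buf.indexOf("\n")) >= 0) {
          const line = buf.slice(0, i);
          buf = buf.slice(i + 1);
          if (!line || line.startsWith(":")) continue; // skip keep-alives
          const event = JSON.parse(line);
          lastSeq = event.seq; // resume point for the next request
          handle(event);
        }
      }
    }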


I don't understand the advantages of recreating SSE yourself like this vs just using SSE.

> SSE always breaks and requires client side retries to get it to work

Yeah, but these are automatic (the browser handles it). SSE is really easy to get started with.


My issue with EventSource is it doesn't use standard auth. Including the JWT in a query string is an odd step out, requiring alternate middleware, and feels like there is a high chance of leaking the token in logs, etc.

I'm curious though, what is your solution to this?

Secondly, not every client is a browser (my OpenAI / fine tune example is non-browser based).

Finally, I just don't like the idea of things failing all time with something working behind the scenes to resolve issues. I'd like errors / warnings in logs to mean something, personally.

>> I don't understand the advantages of recreating SSE yourself like this vs just using SSE

This is more of a strawman; I don't plan to implement it. It is based on experiences consuming SSE endpoints as well as creating them.


> I'm curious though, what is your solution to this?

Cookies work fine, and are the usual way auth is handled in browsers.

> Secondly, not every client is a browser (my OpenAI / fine tune example is non-browser based).

That's fair. It still seems easier, to me, to save any browser-based clients some work (and avoid writing your own spec) by using existing technologies. In fact, what you described isn't even incompatible with SSE - all you have to do is have the server close the connection every 60 seconds on an otherwise normal SSE connection, and all of your points are covered except for the auth one (I've never actually seen bearer tokens used in a browser context, to be fair - you'd have to allow cookies like every other web app).


> it doesn't use standard auth

I'm not sure what this means because it supports the withCredentials option to send auth headers if allowed by CORS


I mean Bearer / JWT

SSE can be implemented over HTTP GET; there is no difference in handling of JWT tokens in headers.
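
With fetch (unlike the EventSource API) you can set whatever headers you want. A minimal sketch (the endpoint and `jwt` variable are made up):

    const res = await fetch("/api/v1/stream", {
      headers: {
        Authorization: `Bearer ${jwt}`,
        Accept: "text/event-stream",
      },
    });
    const reader = res.body.pipeThrough(new TextDecoderStream()).getReader();
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      // value is the raw "data: ..." text; parse the SSE fields here
    }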

It's a minor point in the article, but sending a RequestID to the server so that you get request/response cycles isn't weird nor beyond the pale.

It's pretty much always worth it to have an API like `send(message).then(res => ...)` in a serious app.

But I agree. The upgrade request is confusing, and it's annoying how your websocket server is this embedded thing running inside your http server that never integrates cleanly.

Like instead of just reusing your middleware that reads headers['authorization'] from the websocket request, you access this weird `connectionParams` object that you pretend are request headers, heh.

But the idiosyncrasies aren't that big of a deal (ok, I've just gotten used to them). And the websocket browser API is nicer to work with than, say, EventSource.


It's a good, well-worn tactic. You list in very high detail every single step of any process you don't like. It makes that process seem overly complex, then you can present your alternative and it sounds way simpler.

For example, making a sandwich: You have to retrieve exactly two slices of bread after finding the loaf in the fridge. Apply butter uniformly after finding the appropriate knife, be sure to apply about a 2.1mm level of coating. After all of that you will still need to ensure you've calibrated the toaster!


On the other hand, we're doing the worse tactic of getting held up on the first tiny subheader instead of focusing on the rest of a decent article.

Also, their alternative is just a library. It's not like they're selling a SaaS, so we shouldn't be mean spirited.


> ...we shouldn't be mean spirited.

Am I on the right website? checks URL

People find anything to be mean about on here.


But it is frowned upon.

The loaf shouldn't be in the fridge, and 2.1mm is way too much butter, especially if applied before putting the bread in the toaster

Too much butter? You're not living if that's too much butter!

Sandwich code review is what HN is for.

I think we need a function that returns the correct butter height given the dimensions of the input bread. We may also need an object containing different kinds of bread and the ideal amount of butter for each depending on the absorption characteristics of the bread, etc. The user's preference for butter might also need to be another parameter.

sanwy.ch is the name of the YC25 startup tackling AI sandwich tech.

Pretty much. In this case, WebSockets is simpler to implement than HTTP2; it's closer to raw TCP, you just send and receive raw packets... It's objectively simpler, more efficient and more flexible.

It's a tough sell to convince me that a protocol designed primarily for resource transfer via a strict, stateless request-response mode of interaction, with server push tacked on top as an afterthought, is simpler than something built from the ground up to be bidirectional.


I fixed a few bugs in a WebSocket client and was blown away by the things they do to trick old proxies into not screwing it all up.

I would be interested in those tricks

A big one is 'masking' all client requests so that a proxy can't effectively cache the response, since the request always changes.

The RFC explains it: https://datatracker.ietf.org/doc/html/rfc6455#section-5.3
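
The gist of §5.3: every client-to-server frame is XOR-ed with a random 4-byte key chosen per frame, so no two requests look alike on the wire and a misbehaving proxy can't cache them. A sketch of the transform:

    // octet i of the transformed data is the original octet XOR'd with
    // octet (i mod 4) of the masking key; unmasking is the same operation
    function maskPayload(payload, key) {
      const out = new Uint8Array(payload.length);
      for (let i = 0; i < payload.length; i++) {
        out[i] = payload[i] ^ key[i % 4];
      }
      return out;
    }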


Aren't websockets the only way to get some sort of actual multi-core and threaded code in JavaScript, or is it still subject to the single background thread limitation and it just runs like node does?


You butter bread before it’s toasted? My mind is honestly blown (as I move to kitchen to try this).

Absolutely. The author conveniently leaves out the benefit that websockets enable ditching the frontend js code, including the library the author is plugging. The backend shouldn't send back an error message to the frontend for rendering, but, instead, a rendered view.

This is how I used to do it over TCP, 20 years ago: each request message has a unique request ID which the server echoes and the client uses to match against a pending request. There is a periodic timer that checks if requests have been pending for longer than a timeout period and fails them with an error bubbled up to the application layer. We even had an incrementing sequence number in each message so that the message stream can resume after a reconnect. This was all done in C++, and didn't require a large amount of code to implement. I was 25 years old at the time.

What the author and similar web developers consider complex, awkward or difficult gives me pause. The best case scenario is that we've democratized programming to a point where it is no longer limited to people with highly algorithmic/stateful brains. Which would be a good thing. The worst case scenario is that the software engineering discipline has lost something in terms of rigor.


Every web browser already has a built in system for matching requests and responses and checking if requests have been pending too long. There is no need to reinvent the wheel.

The real problem with the software engineering discipline is that we are too easily distracted from solving the actual business problem by pointless architecture astronautics. At best because of boredom associated with most business problems being uninteresting, at worst to maliciously increase billable hours.


> The real problem with the software engineering discipline is that we are too easily distracted from solving the actual business problem by pointless architecture astronautics.

There are two pervasive themes in software engineering:

- those who do not understand the problem domain complaining that systems are too complex.

- those who understand the problem domain arguing that the system needs to be refactored to shed crude unmaintainable hacks and further support requirements it doesn't support elegantly.

Your comment is in step 1.


Yes, but we’re just talking about making websites here. Rolling your own HTTP is overkill, you’re not Google.

There is a huge difference between guaranteeing algorithmic security of an endpoint, e.g. getting authentication correct, and anticipating every security issue that often has nothing to do with developer code. The former is possible, the latter is not. I understand the author here not wishing to deal with the websocket upgrade process - I would be surprised if there aren’t zero-days lurking there somewhere.

I am beginning to see this increasingly. Apps that make the most basic of mistakes. Some new framework trying to fix something that was already fixed by the previous 3 frameworks. UX designs making no sense or giving errors that used to be solved. From small outfits (that's fair) to multi-billion-dollar companies (you should know better), I feel that rigor is definitely lacking.

A framework was recently posted here where the author was comparing how great their Rust-to-WASM client side state management could handle tens of thousands of records which would cause the JS version of their code to stack overflow...

...and yes, the stack overflow in the JS version was trivially fixable and then the JS version worked pretty well.


That’s basically RPC over WS.

This article conflates a lot of different topics. If your WebSocket connection can be easily replaced with SSE+POST requests, then yeah you don’t need WebSockets. That doesn’t mean there aren’t a ton of very valid use cases (games, anything with real time two-way interactivity).


> games, anything with real time two-way interactivity

No need for WebSockets there as well. Check out WebTransport.


It's even mentioned as the spiritual successor to WebSocket for certain cases in the MDN docs:

https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_...


"if your application requires a non-standard custom solution, then you should use the WebTransport API"

That's a pretty convincing use-case. Why use something standard if it can be non-standard custom instead!


Your projects require holistic and craft solutions. Simple, working ways are the wrong path!

WebTransport is great but it's not in Safari yet.

> No need for WebSockets there as well. Check out WebTransport.

Isn't WebTransport basically WebSockets reimplemented in HTTP/3? What point were you trying to make?


> Isn't WebTransport basically WebSockets reimplemented in HTTP/3?

No.


> No.

Thanks for your insight.

It seems you need to urgently reach out to the people working on WebTransport. You seem to know better and their documentation contradicts and refutes your assertion.

https://github.com/w3c/webtransport/blob/main/explainer.md


Where does that document say that WebTransport is just WebSockets over HTTP/3? The only thing in common is that both features provide reliable bi-directional streams, but WebTransport also supports unreliable streams and a bunch of other things. Please read the docs. There is also RFC 9220 Bootstrapping WebSockets with HTTP/3, which is literally WebSockets over HTTP/3.

> (...) but WebTransport also supports unreliable streams and a bunch of other things.

If you take some time to learn about WebTransport, you will eventually notice that if you remove HTTP/3 from it, you remove each and every single feature that WebTransport touts as a change/improvement over WebSockets.


He said "basically", which should be interpreted as "roughly"? Then it seems his assertion is roughly correct?

Maybe? Isn't WebSockets basically TCP? Roughly? I wrote that WebSockets provide reliable bi-directional streams, but it actually doesn't. It implements message framing. WebTransport also doesn't support "unreliable streams", it's actually called "datagrams". WebTransport doesn't even have to be used over HTTP/3 per the latest spec, so is it basically WebSockets reimplemented in HTTP/3? No.

Last time I checked, none of the common reverse proxy servers (most importantly nginx) supported WebTransport.

> sending a RequestID to the server so that you get request/response cycles isn't weird nor beyond the pale.

To me the sticking point is what if the "response" message never comes? There's nothing in the websocket protocol that dictates that messages need to be acknowledged. With request/response the client knows how to handle that case natively

> And the websocket browser API is nicer to work with than, say, EventSource.

What in particular would you say?


Yeah, you'd need a lib or roll your own that races the response against a timeout.
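
Something like this gets you most of the way (a rough sketch, assuming `ws` is an open WebSocket and the server echoes each request's `id` back in its response):

    const pending = new Map();
    let nextId = 0;

    function send(message, timeoutMs = 5000) {
      const id = nextId++;
      return new Promise((resolve, reject) => {
        const timer = setTimeout(() => {
          pending.delete(id);
          reject(new Error(`request ${id} timed out`));
        }, timeoutMs);
        pending.set(id, { resolve, timer });
        ws.send(JSON.stringify({ id, ...message }));
      });
    }

    ws.addEventListener("message", (e) => {
      const msg = JSON.parse(e.data);
      const entry = pending.get(msg.id);
      if (!entry) return; // unsolicited push, or a response that already timed out
      clearTimeout(entry.timer);
      pending.delete(msg.id);
      entry.resolve(msg);
    });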

Kind of like how you also need to implement app-layer ping/pong over websockets for keepalive even though TCP already has its own keepalive mechanism. -_-

As for EventSource, I don't remember exactly, something always comes up. That said, you could say the same for websockets, since even implementing non-buggy reconnect/backoff logic is annoying.

I'll admit, time for me to try the thing you pitch in the article.


I have only a little experience programming with WebSockets, but I thought the ping/pong mechanism is already built into the protocol. Does it have a timeout? Does it help at the application layer?

Ref: https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_...


You only need to implement it yourself if you've catastrophically fucked up the concurrency model on the client or server side and they can't respond out of band of whatever you're waiting on.

Discord implements its own heartbeat mechanism. I've heard websocket-native ping is somehow unreliable. Maybe in case the websocket connection is fine but something happened at the application layer?

"Unreliable" is a bit harsh - the problem arises imho not from the websocket ping itself, but from the fact that client-side _and_ server-side need to support the ping/pong frames.

The WebSocket browser APIs don't support ping/pong

Native EventSource doesn’t let you set headers ([issue](https://github.com/whatwg/html/issues/2177)), so it’s harder to handle authentication.

> sending a RequestID to the server so that you get request/response cycles isn't weird nor beyond the pal

There's even a whole spec for that: JSON-RPC, and it's quite popular.


IMAP uses request IDs.

>> If it wasn’t, we couldn’t stream video without loading the entire file first

I don't believe this is correct. To my knowledge, video stream requests chunks by range and is largely client controlled. It isn't a single, long lived http connection.


> I don't believe this is correct.

Yes, the statement is patently wrong. There are a few very popular video formats whose main feature is chunking through HTTP, like HTTP Live Streaming or MPEG-DASH.


I believe that's standard for Netflix, etc, but is it also true for plain webms and mp4s in a <video> tags? I thought those were downloaded in one request but had enough metadata at the beginning to allow playback to start before the file is completely downloaded.

Yes it is true.

Browsers talking to static web servers use HTTP byte ranges requests to get chunks of videos and can use the same mechanism to seek to any point in the file.

Streaming that way is fast and simple. No fancy technology required.

For MP4 to work that way you need to render it as fragmented MP4.


Why would the browser send byte range requests for video tags if it expects to play the file back linearly from beginning to end anyway? Wouldn't that be additional overhead/round-trips?

> Why would the browser send byte range requests for video tags if it expects to play the file back linearly from beginning to end anyway?

Probably because byte range is required for seeking, and playing from the beginning is equivalent to seeking at 0.

> Wouldn't that be additional overhead/round-trips?

No because the range of the initial byte range request is the whole file (`bytes=0-`).
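
You can watch this in the dev tools network tab, or reproduce it by hand (hypothetical URL; just a sketch):

    // ask for the whole file as a range; a range-aware server answers 206
    const res = await fetch("/video/sample.mp4", {
      headers: { Range: "bytes=0-" },
    });
    console.log(res.status); // 206 Partial Content (or 200 if ranges are unsupported)
    console.log(res.headers.get("Content-Range")); // e.g. "bytes 0-1048575/1048576"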


My original comment was about the commenter I replied to saying:

> To my knowledge, video stream requests chunks by range and is largely client controlled. It isn't a single, long lived http connection.

Wouldn't a byte range request for the whole file fall under the "single, long lived http connection"? Sure it could be terminated early and another request made for seeking, but regardless the video can start before the whole file is downloaded, assuming it's encoded correctly?


> Wouldn't a byte range request for the whole file fall under the "single, long lived http connection"?

Yes, it would (though a better description would be "a single, long lived http request" because this doesn't have anything to do with connections), and wewewedxfgdf also replied Yes.

> Sure it could be terminated early and another request made for seeking, but regardless the video can start before the whole file is downloaded, assuming it's encoded correctly?

Yes.


The client doesn't want to eat the whole file, so it uses a range request for just the beginning of the file, and then the next part as needed.

The client would actually request the whole file and then terminate the request if the file is no longer needed. This is what browsers do at least.

Both are possible, and in fact I could imagine not all servers being too happy with having to trickle data over a persistent HTTP connection through the entire length of the video, with an almost always full TCP send buffer at the OS level.

> Both are possible

It is possible if you are in control of the client, but no browser would stream an mp4 file request by request.

> with an almost always full TCP send buffer at the OS level

This shouldn't be a problem because there is flow control. Also the data would probably be sent to the kernel in small chunks, not the whole file at once.


> It is possible if you are in control of the client, but no browser would stream an mp4 file request by request.

I believe most browsers do it like that, these days: https://developer.mozilla.org/en-US/docs/Web/Media/Guides/Au...

> This shouldn't be a problem because there is flow control.

It's leveraging flow control, but as I mentioned this might be less efficient (in terms of server memory usage and concurrent open connections, depending on client buffer size and other variables) than downloading larger chunks and closing the HTTP connection in between them.

Many wireless protocols also prefer large, infrequent bursts of transmissions over a constant trickle.


> I believe most browsers do it like that, these days

Nope. Browsers send a byte range request for the whole file (`0-`), and the corresponding time range grows as the file is being downloaded. If the user decided to seek to a different part of the file, say at byte offset 10_000, the browser would send a second byte range request, this time `10000-`, and a second time range would be created (if this part of the file has not already been downloaded). So there is no evidence there that any browser would stream files in small chunks, request by request.

> in terms of server memory usage

It's not less efficient in terms of memory usage because the server wouldn't read more data from the filesystem than it can send with respect to the flow control.

> concurrent open connections

Maybe if you're on HTTP/1, but we live in the age of HTTP/2-3.

> Many wireless protocols also prefer large, infrequent bursts of transmissions over a constant trickle.

AFAIK browsers don't throttle download speed, if that's what you mean.


Ah, interesting, I must have mixed it up/looked at range request based HLS playlists in the past. Thank you!

> AFAIK browsers don't throttle download speed, if that's what you mean.

Yeah, I suppose by implementing a relatively large client-application-side buffer and reading from that in larger chunks rather than as small as the media codec allows, the same outcome can be achieved.

Reading e.g. one MP3 frame at a time from the TCP buffer would effectively throttle the download, limited only by Nagle's Algorithm, but that's probably still much too small to be efficient for radios that prefer to sleep most of the time and then receive large bursts of data.


Realistically you wouldn’t be reading anything from the TCP buffer because you would have TLS between your app and TCP, and it’s pretty much guaranteed that whatever TLS you’re using already does buffering.

That's effectively just another small application layer buffer though, isn't it? It might shift what would otherwise be in the TCP receive buffer to the application layer on the receiving end, but that should be about all the impact.

Oh you’re right, I’m just so used to making the TLS argument because there is also the cost of syscalls if you make small reads without buffering, sorry xD

Are you sure browsers would try to download an entire, say, 10h video file instead of just some chunks of it?

Common sense tells me there should be some kind of limit, but I don't know what it is, whether it's standardized and whether it exists. I just tested, and Firefox _buffered_ (according to the time range) the first ~27,000 seconds, but in the dev tools the request appeared as though still loading. Chrome downloaded the first 10.2 MB (according to dev tools) and stopped (but meanwhile the time range was growing from zero, approximately one second every second, even though the browser had already stopped downloading). After it played for a bit, Chrome downloaded 2.6 more MB _using the same request_. In both cases the browser requested the whole file, but didn't necessarily download the whole file.

Seconded, I've done a userland 'content-range' implementation myself. Of course there were a few ffmpeg-specific parameters the mp4 needed to work right still.

It’s not true, because throwing a video file as a source on a video tag provides no information about the file being requested until the headers are pushed down. Hell, back in 2005 Akamai didn’t even support byte range headers for partial content delivery, which made resuming videos impossible. I believe they pushed out the update across their network in 06 or 07.

If your HTTP server provides and supports the appropriate headers and you’re serving supported file types, then it absolutely is true.

Just putting a url in my Chromium-based browser’s address bar to an mp4 file we have hosted on CloudFlare R2 “just works” (I expect a video tag would be the same), supporting skipping ahead in the video without having to download the whole thing.

Initially skipping ahead didn’t work until I disabled caching on the CloudFlare CDN, as that breaks the “accept-ranges” capability on videos. For now we have a negligible amount of viewership of these mp4s, but if it becomes an issue we’ll use CloudFlare’s video serving product.


> If your HTTP server provides and supports the appropriate headers and you’re serving supported file types, then it absolutely is true.

No. When you play a file in the browser with a video tag, it requests the file. It doesn’t ask for a range. It does use the range if you seek, or if you write the JavaScript to fetch based on a range. That’s why if you press play and pause it buffers the whole video. Only if you write the code yourself can you partially buffer a file like YouTube does.


Nah, it uses complex video specific logic and http range requests as protocol. (At least the normal browsers and servers. You can roll your own dumb client/server of course.)

> That’s why if you press play and pause it buffers the whole video.

Browsers don't do that.


Obviously it doesn’t initially ask for a range if it starts from the beginning of the video, but it starts playing video immediately without requiring the whole file to download; when you seek, it cancels the current request and then does a range request. At no point does it “have” to cache the entire file.

I suppose if you watch it from start to finish without seeking it might cache the entire file, but it may alternatively keep a limited amount cached of the video and if you go back to an earlier time it may need to re-request that part.

Your confidence seems very high on something which more than one person has corrected you on now, perhaps you need to reassess the current state of video serving, keeping in mind it does require HTTP servers to allow range requests.


You can learn it here:

https://www.zeng.dev/post/2023-http-range-and-play-mp4-in-br...

You can also watch it happen - the Chrome developer tools network tab will show you the traffic that goes to and from the web browser to the server and you can see this process in action.


Who cares what happened in 2005? This is so rare nowadays, I've only really seen it on websites that are constructing the file as they go, such as the Github zip download feature.

2005 is basically the dark ages of the web. It’s pre-Ajax, and IE6 was the dominant browser. Using this as an argument is like saying apps aren’t suitable because the iPhone didn’t have an App Store until 2008.

> It’s not true because throwing a video file as a source on video tag has no information about the file being requested until the headers are pushed down.

And yet, if you stick a web server in front of a video and load it in chrome, you’ll see just that happening.


Can load a video into a video tag in chrome. Press play and pause. See it makes a single request and buffers the whole video.

If you stick:

  <video controls>
    <source src="/video/sample.mp4" type="video/mp4">
    Your browser does not support the video tag.
  </video>
into an html file, and run it against this pastebin [0], you'll see that chrome (and safari) both do range requests out of the box if the file is big enough.

[0] https://pastebin.com/MyUfiwYE


Tried it on a 800mb file. Single request.

I tried it on 4 different files, and in each case my browser sent a request, my server responded with a 206 and it grabbed chunks as it went.

They can play back while loading, as long as they are encoded correctly fwiw (faststart encoded).

When you create a video from a device, the header is actually at the end of the file. Understandable: it's where the file pointer was, and mp4 allows this, so your recording device writes it at the end. You must re-encode with faststart (puts the moov atom at the start) to make it load reasonably on a webpage though.


> Understandable, it’s where the file pointer was and mp4 allows this so your recording device writes it at the end.

Yet formats like WAVE, which use a similar "chunked" encoding, just use a fixed-length header and a single seek() to get back to it when finalizing the file. Quicktime and WAVE were released around nearly the same time in the early 90s.

MP2 was so much better I cringe every time I have to deal with MP4 in some context.


At the expense of quite some overhead though, right?

MPEG-2 transport streams seem more optimized for a broadcast context, with their small frame structure and everything – as far as I know, framing overhead is at least 2%, and is arguably not needed when delivered over a reliable unicast pipe such as TCP.

Still, being able to essentially chop a single, progressively written MPEG TS file into various chunks via HTTP range requests or very simple file copy operations without having to do more than count bytes, and with self-synchronization if things go wrong, is undoubtedly nicer to work with than MP4 objects. I suppose that's why HLS started out with transport streams and only gained fMP4 support later on.


> and is arguably not needed when delivered over a reliable unicast pipe such as TCP.

So much content ended up being delivered this way, but there was a brief moment where we thought multicast UDP would be much more prevalent than it ended up being. In that context it's perfect.

> why HLS started out with transport streams and only gained fMP4 support later on.

Which I actually think was the motivation to add fMP4 to base MP4 in the first place. In any case I think MPEG also did a better job with DASH technically but borked it all up with patents. They were really stupid with that in the early 2010s.


Multicast UDP is widely used - but not on the Internet.

We often forget there are networks other than the Internet. Understandable, since the Internet is most open. The Internet is just an overlay network over ISPs' private networks.

SCTP is used in cellphone networks and the interface between them and legacy POTS networks. And multicast UDP is used to stream TV and/or radio throughout a network or building. If you have a "cable TV" box that plugs into your fiber internet connection, it's probably receiving multicast UDP. The TV/internet company has end-to-end control of this network, so they use QoS to make sure these packets never get dropped. There was a write-up posted on Hacker News once about someone at a hotel discovering a multicast UDP stream of the elevator music.


> If you have a "cable TV" box that plugs into your fiber internet connection, it's probably receiving multicast UDP.

That's a good point: I suppose it's a big advantage being able to serve the same, unmodified MPEG transport stream from a CDN, as IP multicast over DOCSIS/GPON, and as DVB-C (although I’m not sure that works like that, as DVB usually has multiple programs per transponder/transport stream).


The long answer is "it depends on how you do it". Unsurprisingly, video and voice/audio are probably the most different ways that you can "choose" to do distribution.

This. You can't just throw it into a folder and have it stream. The web server has to support it, and then there is encoding and formats.

Yea this works for mp4 and HN seems confused about how.

The MOOV atom is how range requests are enabled, but the browser has to find it first. That's why it looks like it's going to download the whole file at first. It doesn't know the offset. Once it reads it, the request will be cancelled and targeted range requests will begin.


The two are essentially the same thing, modulo trading off some unnecessary buffering on both sides of the TCP pipe in the "one big download" streaming model for more TCP connection establishments in the "range request to refill the buffer" one.

For MP4s the metadata is at the end annoyingly enough.

MP4 allows the header at the start or the end.

It’s usually written at the end, since it’s not a fixed size and it’s a pain for recording and processing tools to rewrite the whole file on completion just to move the header to the start. You should always re-encode to move the header to the start for the web though.

It’s something you see too much of online once you know about it but mp4 can absolutely have the header at the start.


You can pass `-movflags faststart` to ffmpeg when encoding (e.g. `ffmpeg -i in.mp4 -c copy -movflags faststart out.mp4`) to place it at the beginning.

implementations may request the metadata range at the end in this case, if the content length is known

For "VOD", that works (and is how very simple <video> tag based players sometimes still do it), but for live streaming, it wouldn't – hence the need for fragmented MP4, MPEG-DASH, HLS etc.

It does work for simpler codecs/containers though: Shoutcast/Icecast web radio streams are essentially just endless MP3 downloads, optionally with some non-MP3 metadata interspersed at known intervals.


Correct. HLS and Dash are industry standards. Essentially the client downloads a file which lists the files in various bitrates and chunks and the client determines which is best for the given connectivity.

And even if you are using a "regular" video format like mp4, browsers will still use range requests [1] to fetch chunks of the file in separate requests, assuming the server supports it (which most do).

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Ran...


Correct

> Bonus: Making it easy with eventkit

Why not just use SSE? https://developer.mozilla.org/en-US/docs/Web/API/Server-sent...


I've noticed some weird behaviors with the EventSource impl that browsers ship with. Chief among them: the default behavior is to reconnect forever after the server closes the stream, so you have to coordinate some kind of special stop event to stop the client from reconnecting. You wouldn't have that problem with the stream object from Response.body

The SSE protocol is actually just a long-running stream like I mentioned but with specific formatting for each chunk (id, event, and data fields)

as a side note, eventkit actually exports utilities to support SSE both on client and server. The reason you'd want to use eventkit in either case is because it ships with some extra transformation and observability goodies. https://hntrl.github.io/eventkit/guide/examples/http-streami...
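
to make that concrete, reading the stream with fetch instead of EventSource leaves reconnecting entirely up to you (a minimal sketch; hypothetical endpoint):

    const res = await fetch("/events");
    const reader = res.body.pipeThrough(new TextDecoderStream()).getReader();
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break; // server closed the stream; nothing reconnects unless you do
      console.log(value); // each decoded chunk as it arrives
    }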


The reconnect thing is actually quite helpful for mobile use cases. Say the user switches the tab, closes their browser or loses network and then they return. Since SSE is stateless from the client's perspective, the client can just reconnect and continue receiving messages. Whereas with WS there are handshakes to worry about--and also other annoyances like what to do with pending requests before the connection was lost.

SSE is great. Most things with websockets would be fine with SSE.

Also I don't see it being much easier here than a few primitives and learning about generator functions if you haven't had experience with them. I appreciate the helper, but the API is pretty reasonable as-is IMO


I’m experimenting with SSE for realtime project deployment logs in https://lunni.dev/ and it’s been extremely pleasant so far.

The only problem is, if you want to customize the request (e.g. send a POST or add a header), you have to use a third-party implementation (e.g. one from Microsoft [1]), but I hope this can be fixed in the standards later.

[1]: https://www.npmjs.com/package/@microsoft/fetch-event-source


The helper example was a shameless attempt to plug the project I've been working on (tinkering with it is how I came up with the example). The library I plugged has much more to do with enabling a more flexible reactive programming model in js, but it just so happens to plug into the stream API pretty handily. Still an interesting look IMO if you're into that kind of stuff

No worries, I know how it feels! (I said, plugging away my own project in a sibling comment, lol)

I do like the reactive approach (in fact, I’ve reinvented something similar over SSE). I feel a standards-based solution is just ever so slightly more robust/universal.


SSE doesn't support binary data without encoding to something base64 first. These days I'd recommend a fetch stream with TLV messages first, followed by WebSocket.
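
TLV framing over a fetch stream is only a few lines. A sketch with a made-up layout (1 type byte, 4-byte big-endian length, then the payload):

    // feed it the accumulated bytes; returns complete frames plus the leftover
    function parseFrames(buf /* Uint8Array */) {
      const frames = [];
      let off = 0;
      while (buf.length - off >= 5) {
        const view = new DataView(buf.buffer, buf.byteOffset + off, buf.length - off);
        const type = view.getUint8(0);
        const len = view.getUint32(1); // big-endian by default
        if (buf.length - off < 5 + len) break; // incomplete frame; wait for more bytes
        frames.push({ type, payload: buf.subarray(off + 5, off + 5 + len) });
        off += 5 + len;
      }
      return { frames, rest: buf.subarray(off) };
    }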

Based on my read, this basically is SSE but doesn't use the same protocol.

Do CDN, such as Cloudflare, support SSE? The last time I looked, they didn't, but maybe things have changed.

Cloudflare doesn't officially support SSE, but if you send keepalive events every 15 or 20 sec or so, you can reliably use SSE for 40+ minutes in my experience.

No server traffic for 100+ sec officially results in a 524, so you could possibly make that keepalive interval longer, but I haven't tested it.

Make sure to have the new style cache rule with Bypass cache selected and absolutely make sure you are using HTTP/2 all the way to the origin.

The 6-connections-per-browser limit of HTTP/1.1 SSE was painful, and I am pretty sure auto negotiation breaks, often in unexpected ways, with an HTTP/1.1 origin.
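
The keepalive itself is trivial, since any line starting with ":" is an SSE comment that clients ignore. Inside whatever connection handler you have (a sketch; `req`/`res` are the usual Node objects):

    // nudge the connection every 15s so intermediaries don't see it as idle
    const keepalive = setInterval(() => res.write(": keepalive\n\n"), 15000);
    req.on("close", () => clearInterval(keepalive));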


I have a demo of this for CF workers https://github.com/hntrl/eventkit/tree/main/examples/workers...

(it's not SSE in particular, but it demonstrates that you can have a long running stream like SSE)


On top of the comments below about SSE, I'd also point out that Cloudflare is doing some interesting stuff around serverless resumable websockets. They also have stuff for WebRTC.

Yes, they are

SSE is the way to roll.

The problem selects the solution.

That said, I like SSE for unidirectional string-encoded events.


I don't know why people keep trying desperately to avoid the simplicity and flexibility of WebSockets.

A lot of times, what people need is a bidirectional connection yet somehow they convince themselves that SSE is better for the job... But they end up with two different types of streams; HTTP for writes and responses and SSE for passively consuming real-time data... Two different stream types with different lifecycles; one connection could fail while the other is fine... There is no way to correctly identify what is the current connection status of the app because there are multiple connections/statuses and data comes from multiple streams... Figuring out how to merge data coming from HTTP responses with data coming in passively from the SSE is messy and you have no control over the order in which the events are triggered across two different connections...

You can't enforce a serial, sequential, ordered flow of data over multiple connections as easily, it gets messy.

With WebSockets, you can easily assign an ID to requests and match it with a response. There are plenty of WebSocket frameworks which allow you to process messages in-order. The reason they work and are simple is because all messages pass over a single connection with a single state. Recovering from lost connections is much more straight forward.


I don't know why everyone... proceeds to use their own experience as proof of what everyone needs.

These are tools, not religions.

Websockets have some real downsides if you don't need bidirectional comms.


who's to say your data is coming from multiple streams? You can propagate any updates you need to make in the application to a single stream (like SSE or a long-lived response) in place of a WebSocket. your http responses can just be always 204 if all they're doing is handling updates and pushing events to aforementioned single stream.

https://en.wikipedia.org/wiki/Command%E2%80%93query_separati...


SSE does not require a separate connection, unlike WebSockets.

MDN disagrees. See the huge red warning here https://developer.mozilla.org/en-US/docs/Web/API/EventSource

Unless you mean on HTTP2? But aren't WS connections also multiplexed over HTTP2 in that case?


> MDN disagrees. See the huge red warning here https://developer.mozilla.org/en-US/docs/Web/API/EventSource

It should say "When used over HTTP/1" instead of "When not used over HTTP/2" because nowadays we also have HTTP/3, and browsers barely even use HTTP/1, so I would say it's pretty safe to ignore that warning.

> Unless you mean on HTTP2?

Any version of HTTP that supports multiplexing.

> But aren't WS connections also multiplexed over HTTP2 in that case?

There is RFC 8441 but I don't think it's actually implemented in the browsers.


> There is RFC 8441 but I don't think it's actually implemented in the browsers.

Found this: https://github.com/mattermost/mattermost/issues/30285


https://chromestatus.com/feature/6251293127475200

It looks like it's supported in Chrome and Firefox but not in Safari.


It's javascript, anything simple needs a framework.

The problem with HTTP2 is that the server-push aspect was tacked on top of an existing protocol as an afterthought. Also, because HTTP is a resource transfer protocol, it adds a whole bunch of overhead like request and response headers which aren't always necessary but add to processing time. The primary purpose of HTTP2 was to allow servers to preemptively push files/resources to clients to avoid round-trip latency; to reduce the reliance on script bundles.

WebSockets is a simpler protocol built from the ground up for bidirectional communication. It provides a lot more control over the flow of data as everything passes over a single connection which has a single lifecycle. It makes it a lot easier to manage state and to recover cleanly from a lost connection when you only have one logical connection. It makes it easier to process messages in a specific order and to do serial processing of messages. Having just one connection also greatly simplifies things in terms of authentication and access control.

I considered the possibility of switching the transport to HTTP2 for https://socketcluster.io/ years ago, but it's a fundamentally more complex protocol which adds unnecessary overheads and introduces new security challenges so it wasn't worth it.


> The primary purpose of HTTP2 was to allow servers to preemptively push files/resources to clients to avoid round-trip latency; to reduce the reliance on script bundles.

No, it was not. The primary goal of HTTP/2 was to get over traditional connection limits through connection multiplexing because browsers treat TCP connections as an extremely scarce resource. Multiplexing massively improves the ability to issue many asynchronous calls, which are very common -- and H2 went on to make the traditional HTTP stack more efficient across the board (i.e. header compression.) Some of the original HTTP/2 demo sites that popped up after Google first supported it in Chrome were of loading many images over HTTP/1 vs HTTP/2, which is very common. In one case of my own (fetching lots of small < 1kb files recursively from S3, outside the browser) HTTP/2 was like a 100x performance boost over HTTP/1 or something.

You're correct Server Push was tacked on and known to be flawed very early on, and it took a while before everyone pulled the plug on it, but people fixated on it because it just seemed really cool, from what I can tell. But it was never the lynchpin of the thing, just a (failed and experimental) boondoggle.


> The primary purpose of HTTP2 was to allow servers to preemptively push files/resources to clients to avoid round-trip latency; to reduce the reliance on script bundles.

The primary purpose for HTTP2 was to allow multiple simultaneous asynchronous http calls, which is a massive loading performance boost for most websites. Server push was very much a tacked-on afterthought.


How can server push be a problem with HTTP/2 if nobody supports server push? It's dead. And what about multiplexing and header compression? Not worth it?

Server push is dead though, SSE is a different idea with completely different semantics (and tradeoffs).

Agree. After banging my head against http2 for years, I now really enjoy how simple websockets are and their universal support.

Me: For this POC you've given me, I will do an old-fashioned HTTP form submit, no need for anything else.

Architect: But it must have websockets!

Me: Literally nothing in this POC needs XHR, much less websockets. It's a sequential buy flow with nothing else going on.

Architect: But it has to have websockets, I put them on the slide!

(Ok he didn't say the part about putting it on the slide, but it was pretty obvious that's what happened. Ultimately I caved of course and gave him completely unnecessary websockets.)


My strategy for this kind of situation is to avoid direct rejection. Instead of saying stuff like "it's unnecessary" or "you are wrong", I push for trying first without.

I would say:

> Once we have a working MVP without websockets we can talk again to think about using websocket.

Most times, once something is working, they stop caring, or we have other priorities by then.


I always try to push back on those beliefs, asking for the reasoning behind why they believe it will be faster or more efficient than some other solution.

I've found, if you could typecast those people, they would be a tech architect who only uses "web scale" items. (Relevant link: https://www.youtube.com/watch?v=5GpOfwbFRcs )


I call them Powerpoint architects.

Having deployed WebSockets into production, I came to regret it over the next years. Be it nginx terminating connections after 4/8 hours, browsers not reconnecting after sleep, and other issues, I am of the opinion that WebSockets and other forms of long-standing connections should be avoided if possible.

Not to mention, some major parts of the websocket API have been broken in Google Chrome for over two years now.

Chrome no longer fires Close or Error events when a websocket disconnects (well, at least not when they happen, they get fired about 10 minutes later!). So, your application won't know for 10 minutes that the connection has been severed (unless the internet connection is also lost, but that isn't always the case when a websocket is disconnected).

Here's the chrome bug:

https://issuetracker.google.com/issues/362210027?pli=1

From that bug report it looks like the Chrome bug is less than a year old, but it was originally mentioned here in April 2023, in a question about a similar bug in iOS (the iOS bug has been resolved):

https://stackoverflow.com/questions/75869629/ios-websocket-c...

I kind of suspect Chrome is actually doing this intentionally. I believe they do this so a tab can recover from background sleep without firing a websocket close event. That's helpful in some cases, but it's a disaster in other cases, and it doesn't matter either way... it breaks the specification for how websockets are expected to work. WebSockets should always fire Close and Error events immediately when they occur.


If you want to use websockets, then you are most definitely going to need some library that wraps the websocket, because websockets themselves are very simple and don't do things like reconnect on their own.

This one is pretty simple and pretty great: https://github.com/lukeed/sockette

I did my own which provides rpc functionality and type safety: https://github.com/samal-rasmussen/smolrpc


Even load balancers force you to have a frequent heartbeat all the way to the client for each connection.

People interested in HTTP streaming should check out Braid-HTTP: https://braid.org. It adds a standard set of semantics that elegantly extend HTTP with event streaming into a robust state synchronization protocol.

Oof, what a headline to be top of hn the day after you implement websockets into a project.

We've had a production app with them for over 10 years and it's generally great. The only thing to be aware of is this Chrome bug:

https://issuetracker.google.com/issues/362210027?pli=1

You can add a recurring ping/pong between the client/server so you can know with some recency that the connection has been lost. You shouldn't have to do that, but you probably want to until this bug is fixed.
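
Client-side it can be as little as this (a rough sketch; `ws`, `reconnect()`, and JSON messages are assumed, and the intervals are arbitrary):

    // ping every 15s; if no pong lands within 10s, assume the socket is dead
    let pongTimer;
    const pingTimer = setInterval(() => {
      ws.send(JSON.stringify({ type: "ping" }));
      pongTimer = setTimeout(() => {
        clearInterval(pingTimer);
        ws.close(); // Chrome may not fire close/error promptly, so give up ourselves
        reconnect();
      }, 10000);
    }, 15000);

    ws.addEventListener("message", (e) => {
      if (JSON.parse(e.data).type === "pong") clearTimeout(pongTimer);
    });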


60s heartbeat interval, job done.

We've got multiple internal apps using WebSockets in production, for years. I have to say I don't really get all the concern in the article about upgrading the connection - any decent backend framework should handle this for you without a problem.

Hacker News articles on new libraries generally live in the 1% of the 1%. For lots of websites, they don't need a websocket because they are just doing CRUD. For the 1% doing live updates, websockets are great and straightforward. For whatever specialised use case the article has, sure there's something even less well supported you can pivot to.


We use a much shorter heartbeat interval because our use case is real-time control and monitoring, so our users need to know immediately if the connection is lost.

Websockets work great, don't worry too much about it.

I don't know why the topic of websockets is so weird. 80% of the industry seems to have this skewed, idealised perception of websockets as the next frontier of their web development career and cannot wait to use them for anything remotely connected to streaming/realtime use cases. When you point out the nuances, and that websockets should actually be avoided wherever they are not absolutely needed, people get defensive and offended, killing every healthy discussion about realistic tradeoffs.

Websockets have a huge number of downsides, especially losing many of the niceties and simplicity of http: the tooling, the ease of reasoning about it, the shared knowledge, and the operational experience. As many here have pointed out, the goto solution for streaming server changes is h2/h3 and SSE. Everything in the other direction that can be accomplished with batching, landing in the ballpark of at most 0.5 req/s per client, does NOT need websockets.

There is no reason to avoid WebSockets. This is a conclusion people come to because they are familiar with HTTP round trips and cannot imagine anything different.

There are no nuances to understand. It’s as simple as fire and forget.

The only downside to WebSockets is that they are session oriented. Conversely, compared to WebSockets, the only upside to HTTP is that it's sessionless.


I just realized that modern web applications are a group form of procrastination. Procrastination is a complex thing. But essentially, it's putting something off because of some perceived pain, even though the thing may be important or even inevitable, and eventually the procrastination leads to negative outcomes.

Web applications were created because people were averse to creating native applications, for fear of the pain involved with creating and distributing native applications. They were so averse to this perceived pain that they've done incredibly complex, even bizarre things, just so they don't have to leave the web browser. WebSockets are one of those things: taking a stateless client-server protocol (HTTP) and literally forcing it to turn into an entirely new protocol (WebSockets) just so people could continue to do things in a web browser that would have been easy in a native application (bidirectional stateful sockets, aka a tcp connection).

I suppose this is a normal human thing. Like how we created cars to essentially have a horseless buggy. Then we created paved roads to make that work easier. Then we built cities around paved roads to keep using the cars. Then we built air-scrubbers into the cars and changed the fuel formula when we realized we were poisoning everyone. Then we built electric cars (again!) to try to keep using the cars without all the internal combustion issues. Then we built self-driving cars because it would be easier than expanding regional or national public transportation.

We keep doing the easy thing, to avoid the thing we know we should be doing. And avoiding it just becomes a bigger pain in the ass.


I agree with a lot of that. But, it's a lot easier to get someone to try your web app than install a native app. It's also easier to get the IT department to allow an enterprise web app than install a native app. Web apps do have some advantages over native apps.

Yes, all of that is the reason why we procrastinate. "Easy" is the excuse we give ourselves to do the things we would otherwise have no justification for, and avoid the difficult things we know would be better. It's not my fault; it's not my responsibility; I shouldn't have to do extra work; it's too complicated; it'll be hard; it'll be long; it'll be painful; it's not perfect; it might fail. No worries; there's something easier I can do.

Thus we see the flaws in the world, and shrug. When someone else does this, we get angry, and indignant. How dare someone leave things like this! Yet when we do it, we don't make a peep.


You left out the part where you explain why native apps are so much better for users and developers than web apps?

I can't tell why you think WebSockets are so bizarre.


Many advantages, for example web apps get suspended if you’re not browsing the tab. But I do agree it’s much more attractive to write web apps mainly for portability.

Native apps also get suspended, or can be, like on iOS (not being a fanboy, I appreciate this mechanism). Native desktop apps can also be mostly suspended while not in use.

Web apps are just way easier to build (rarely well), so many people are building them without real engineering or algorithms knowledge, producing trash every day. The article speaks in the same voice: it paints one protocol as completely bad, mentions only the issues both approaches share, and silently omits those same issues when describing "the only way", the artisanal, holistic, Rust-and-WASM-based solution (not a plug, of course).


> Native apps also get suspended, or can be, like on iOS (not being a fanboy, I appreciate this mechanism).

On iOS web apps get suspended very aggressively, and there is no way for a web app to signal to the browser to not suspend it. I never developed native mobile apps, but I assume it’s less aggressive for native apps and/or native apps have a way to prevent themselves from being suspended. This doesn’t seem to be an issue on desktop though.


> bidirectional stateful sockets, aka a tcp connection

Which is not "easy" to do over the internet, so the native app folks ended-up using HTTP anyway. (Plus they invented things like SOAP.)


TCP is easy to do over the internet. Did you mean the middleboxes? Ah, the middleboxes, the favorite scapegoat of the world wide web's cabal of committees. You'd think they were absolutely powerless. Like firewalls and application proxies are a fundamental principle of nature; unable to be wrestled, only suffered under. Yet the web controls a market share 500 times larger.

WebSocket messages are not meant to be consumed as streams (the TCP equivalent) but as datagrams, aka packets (the UDP equivalent). Correct me if I am wrong, but the WebSocket API in browser JavaScript is pretty poor: the only backpressure signal is a byte counter you have to poll yourself, and it cannot make assertions about delivery. If you want to use websockets as TCP-like streams, including session handling, great care should be taken, as this is natively available in neither RFC 6455 nor the browser.
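For what it's worth, the closest the browser API gets to backpressure is polling bufferedAmount before sending; a crude sketch (the threshold and retry interval are arbitrary):

  // bufferedAmount is the number of bytes queued but not yet handed to
  // the network. There is no "drained" event, so you have to poll it.
  const MAX_BUFFERED = 1024 * 1024 // 1 MiB

  function sendWithBackpressure(ws, data) {
    if (ws.bufferedAmount > MAX_BUFFERED) {
      // A real implementation would queue messages to preserve order
      setTimeout(() => sendWithBackpressure(ws, data), 100)
    } else {
      ws.send(data)
    }
  }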

I usually start with long polling/SSE and migrate to WebSockets when needed. It is cheap and reliable, with almost no performance overhead compared to WebSockets.

You don't need websockets; SSE works fine for realtime collaborative apps.

Websockets sound great on paper. But, operationally they are a nightmare. I have had the misfortune of having to use them at scale (the author of Datastar had a similar experience). To list some of the challenges:

- firewalls and proxies, blocked ports

- unlimited non-multiplexed connections (so bugs lead to DDoS)

- load balancing nightmare

- no compression

- no automatic handling of disconnect/reconnect.

- no cross site hijacking protection

- Worse tooling (you can inspect SSE in the browser).

- Nukes mobile battery because it hammers the duplex antenna.

You can fix some of these problems with websockets, but these fixes mostly boil down to sending more data... to send more data... to get you back to your own implementation of HTTP.

SSE on the other hand, by virtue of being regular HTTP, works out of the box with headers, multiplexing, compression, disconnect/reconnect handling, h2/h3, etc.
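For comparison, this is roughly the entire client side of an SSE stream, reconnects included (the /events endpoint and render function here are made up):

  // EventSource reconnects automatically on drop and re-sends the
  // Last-Event-ID header so the server can resume where it left off
  const es = new EventSource('/events')
  es.onmessage = (e) => render(JSON.parse(e.data))
  es.onerror = () => console.log('disconnected, the browser will retry')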

If SSE is not performant enough for you then you should probably be rolling your own protocol on UDP rather than using websockets. Or wait until WebTransport is supported in Safari (any day now…).

Here's a demo of a real time multiplayer Game of Life that's using SSE and compression:

https://example.andersmurphy.com

It's doing a lot of other dumb stuff explained a bit more here, but the point is you really really don't need websockets (and operationally you really don't want them):

https://andersmurphy.com/2025/04/07/clojure-realtime-collabo...


Useful take, thanks for mentioning specifics. Some of these I wasn't aware of.

- What makes load balancing easier with SSE? I imagine that balancing reconnects would work similarly to WS.

- Compression might be a disadvantage for binary data, which WS specializes in.

- Browser inspection of SSE does sound amazing.

- Mobile duplex antenna is way outside my wheelhouse, sounds interesting.

Can you see any situation in which websockets would be advantageous? I know that SSE has some gotchas itself, such as limited connections (6) per browser. I also wonder about the nature of memory and CPU usage for serving many clients on WS vs SSE.

I have a browser game (few players) using vanilla WS.


Thanks.

- Load balancing is easier because your connection is stateless. You don't have to connect to the same server when you reconnect. Your up traffic doesn't have to go to the same server as your down traffic. Websockets tend to come with a lot of connection context. With SSE you can easily kill nodes, and clients will reconnect to other nodes automatically.

- The compression is entirely optional. So when you don't need it don't use it. What's great about it though is it's built into the browser so you're not having to ship it to the client first.

- The connection limit of 6 only applies to HTTP/1.1, not HTTP/2/3. If you are using SSE you'll want HTTP/2/3. But generally you want HTTP/2/3 from your proxy/server to the browser anyway, as it has a lot of performance/latency benefits (you'll want it for multiplexing your connection anyway).

- In my experience CPU/memory usage is lower than with websockets. Obviously, languages with virtual/green threads (Go, Java, Clojure) make them more ergonomic to use. But a decent async implementation can scale well too.

Honestly, and this is just an opinion, no I can't see when I would ever want to use websockets. Their reconnect mechanisms are just not reliable enough and their operational complexity isn't worth it. For me at least it's SSE or a proper gaming net code protocol over UDP. If your browser game works with websockets it will work with SSE.


I appreciate the answers. For others reading, I also just ran across another thread where you posted relevant info [0]. In the case of my game, I'm going to consider SSE, since most of the communication is server to client. That said, I already have reconnects etc implemented.

In my research I recall some potential tradeoffs with SSE [1], but even there I concluded they were minor enough to consider SSE vs WS a wash [2] even for my uses. Looking back at my bookmarks, I see that you were present in the threads I was reading, how cool. A couple WS advantages I am now recalling:

SSE is one-way, so for situations with lots of client-sent data, a second connection will have to be opened (with overhead). I think this came up for me since if a player is sending many events per second, you end up needing WS. I guess you're saying to use UDP, which makes sense, but has its own downsides (firewalls, WebRTC, WebTransport not ready).

Compression in SSE would be negotiated during the initial connection, I have to assume, so it wouldn't be possible to switch modes or mix in pre-compressed binary data without reconnecting or base64-ing binary. (My game sends a mix of custom binary data, JSON, and gzipped data which the browser can decompress natively.)

Edit: Another thing I'm remembering now is order of events. Because WS is a single connection and data stream, it avoids network related race conditions; data is sent and received in the programmatically defined sequence.

0: https://news.ycombinator.com/item?id=43657717

1: https://rxdb.info/articles/websockets-sse-polling-webrtc-web...

2: https://www.timeplus.com/post/websocket-vs-sse


Cool. I didn't notice either. :)

With http2/3 it's all multiplexed over the same connection, and as far as your server is concerned that up request/connection is very short lived.

Yeah, mixed formats are probably a use case for compression (like you said, once you commit to compression with SSE there's no switching during the connection). But then you still need to configure compression yourself with websockets. The main compression advantage of SSE is that it's not per message, it's for the whole stream. The implementations of compression with websockets I've seen have mostly been per-message compression, which is much less of a win (I'd get around 6:1, maybe 10:1 with the game example, not 200:1, and pay a much higher server/client CPU cost).

Websockets have similar issues with firewalls and TCP. So in my mind if I'm already dealing with that I might as well go UDP.

As for ordering, that's part of the problem that makes websockets messy (with reconnects etc). I prefer to build resilience into the system, so in the case of that demo I shared, if you lose your connection and reconnect you automatically get the latest view (there's no playback of events that needs to happen). SSE automatically sends the last received event id on reconnect (so you can play back missed events if you want, not my thing personally). I mainly use the event ID as a hash of content: if the hash is the same, don't send anything, the client already has the latest state.
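A sketch of that server side in Node (hand-rolled SSE; latestState and hashOf are stand-ins):

  import http from 'node:http'

  http.createServer((req, res) => {
    res.writeHead(200, {
      'content-type': 'text/event-stream',
      'cache-control': 'no-cache',
    })
    const send = (state) => {
      // The id: field becomes the Last-Event-ID header on reconnect
      res.write(`id: ${hashOf(state)}\ndata: ${JSON.stringify(state)}\n\n`)
    }
    // On reconnect the browser tells us the last event it saw, so we
    // can skip re-sending state the client already has
    if (req.headers['last-event-id'] !== hashOf(latestState()))
      send(latestState())
    // ...then keep the response open and call send() on every change
  }).listen(3000)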

By design, that's the way I build things with CQRS: up events never have to be ordered with down events. Think about a game loop; my down events are basically a render loop. They just return the latest state of the view.

If you want to order up events (rarely necessary), you can batch on the client to preserve order. You can use a client timestamp/hash of the last event (if you want to get fancy), and the server orders and batches those events in sync with the loop, i.e. everything it got in the last X time (like blockchains/trading systems). This is only per-client ordering, not distributed client ordering; otherwise you get into Lamport clocks etc.

I've been burnt too many times by thinking websockets will solve the network/race conditions for me (and then failing spectacularly), so I'd rather build the system to handle disconnects rather than rely on ordering guarantees that sometimes break.

Again, though my experience has made me biased. This is just my take.


What do you mean by "inspect in browser"? All major browsers' devtools have supported WebSocket inspecting for many years.

Many of the other issues mentioned are also trivial to solve (reconnects, cross-origin protection).

Also, doesn't WebTransport have many of the same issues? (e.g. with proxies and firewalls). And do you have any data for the mobile battery claim? (assuming this is for an application in foreground with the screen on)


The fact that you are saying they are trivial to solve means you probably need more visibility on your system. Reliable reconnect was the nightmare we saw regularly.

Unfortunately, I can't go into much detail on the mobile battery stuff, but I can give you some hints. If you do some reading on how antennas on phones work, combined with websocket heartbeat ping/pong, you should get the idea.


> If you do some reading on how antennas on phones work, combined with websocket heartbeat ping/pong, you should get the idea.

The implication is that the ping/pong keeps the system active when it wouldn't otherwise be necessary, but how else are you receiving data or detecting a lost connection with the other mechanisms? The lower layers have their own keepalives, so what's different?

I looked into it a little since it didn't make sense to me, unless you're comparing apples and oranges, but the only research I could find either didn't seem to support your stance or compared WebSockets to the alternative of just simply not being able to receive data in a timely manner.


You can also use long polling, which keeps a connection alive so the server can respond immediately when there's new data. For example:

Server

  const LONG_POLL_SERVER_TIMEOUT = 8_000

  function longPollHandler(req, response) {
    // e.g. client can be out of sync if the browser tab was hidden while a new event was triggered
    const clientIsOutOfSync = parseInt(req.headers.last_received_event, 10) !== myEvents.count
    if (clientIsOutOfSync) {
      sendJSON(response, myEvents.count)
      return
    }

    function onMyEvent() {
      myEvents.unsubscribe(onMyEvent)
      sendJSON(response, myEvents.count)
    }
    // If no event arrives before the timeout, respond with the current
    // count anyway, so the client re-polls and intermediaries don't
    // kill an idle connection
    response.setTimeout(LONG_POLL_SERVER_TIMEOUT, onMyEvent)
    req.on('error', () => { // client went away; stop waiting
      myEvents.unsubscribe(onMyEvent)
      response.destroy()
    })
    myEvents.subscribe(onMyEvent)
  }



Client (polls when tab is visible)

  // Keep in sync with the server's LONG_POLL_SERVER_TIMEOUT
  const LONG_POLL_SERVER_TIMEOUT = 8_000

  // Initialize the poll state before the first request, so the first
  // last_received_event header isn't the string "undefined"
  pollMyEvents.isPolling = false
  pollMyEvents.oldCount = 0

  pollMyEvents()
  document.addEventListener('visibilitychange', () => {
    if (!document.hidden)
      pollMyEvents()
  })
  async function pollMyEvents() {
    if (pollMyEvents.isPolling || document.hidden)
      return
    try {
      pollMyEvents.isPolling = true
      const response = await fetch('/api/my-events', {
        signal: AbortSignal.timeout(LONG_POLL_SERVER_TIMEOUT + 1000),
        headers: { last_received_event: pollMyEvents.oldCount }
      })
      if (response.ok) {
        const nMyEvents = await response.json()
        if (pollMyEvents.oldCount !== nMyEvents) { // because it could be < or >
          pollMyEvents.oldCount = nMyEvents
          setUIState('eventsCount', nMyEvents)
        }
        pollMyEvents.isPolling = false
        pollMyEvents()
      }
      else
        throw response.status
    }
    catch (_) {
      pollMyEvents.isPolling = false
      setTimeout(pollMyEvents, 5000)
    }
  }

Working example at Mockaton: https://github.com/ericfortis/mockaton/blob/6b7f8eb5fe9d3baf...

Yep, have used long polling with no downsides for ~20 years. 95% of the time I see web sockets it's unnecessary.

> We can’t reliably say “the next message” received on the stream is the result of the previous command since the server could have sent any number of messages in between now and then.

Doing so is a protocol decision though, isn't it?

If the protocol specifies that the server either clearly identifies responses as such, or only ever sends responses, and further doesn't send responses out of order, I don't see any difference to pipelined HTTP: The client just has to count, nothing more. (Then again, if that's the use case, long-lived HTTP connections would do the trick just as well.)
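Concretely, with such a protocol the client-side bookkeeping is just a queue of pending resolvers (a sketch; send() stands in for whatever writes a command to the stream):

  // Works only if the server answers every command, in order
  const pending = []

  function request(command) {
    send(command)
    return new Promise((resolve) => pending.push(resolve))
  }

  function onResponse(msg) { // call for each response read off the stream
    pending.shift()(msg)
  }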


What happens if a message somehow gets lost? Dropped packets, error, etc? Or is that completely precluded by using http streaming?

TCP provides a lossless, in-order stream, and errors are corrected at that layer and below, so HTTP streaming and WebSockets are equivalent in that regard.

Why not use a library like socket.io? It handles the socket lifecycle, reconnection etc.

I think an article like this would benefit from focusing more on protocols, rather than particular APIs for working with them: referencing the specifications and providing examples of messages. I am pretty sure the article is about chunked transfer encoding [1], but that is not mentioned anywhere. Possibly it tries to cover newer HTTP versions as well, abstracting from the exact mechanisms, in which case "JS API" in the title would clarify it.

As for the tendency described, this seems to be an instance of the law of the instrument [2], combined with some instruments being more trendy than others. Which comes up all the time, but raising awareness of more tools should indeed be useful.

[1] https://en.wikipedia.org/wiki/Chunked_transfer_encoding

[2] https://en.wikipedia.org/wiki/Law_of_the_instrument


We are looking into adopting bidirectional streams, and have identified gRPC as a likely ideal candidate. It provides a layer on top of the blobs (partial responses) sent by either side, and takes over the required chunking and dechunking. And it doesn't have the authentication issues that Websockets have. I'd appreciate any insights on this matter.

What does this solve? Genuine question. You still have to manage connectivity and synchronization. Also, I'm not so sure that reading the stream will necessarily yield chunks quantized to the updates the server sent.

If you use a proper framework, you don't have to manage the socket lifecycle and it doesn't complicate your server.

WebSockets are full duplex, so both sides of a connection are equally transmitting sides. The first section fails to understand this and then builds some insane concern about state on top of that faulty notion. WebSockets don't care about your UI framework, just like your car doesn't care what time you want to eat dinner.

> You have to manage the socket lifecycle

You have to do the very same thing with HTTP keep-alive or use a separate socket for each and every HTTP request, which is much slower. Fortunately the browser makes this stupid simple in regards to WebSockets with only a few well named events.
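Indeed, the entire browser-side lifecycle surface is four events (the URL here is made up):

  const ws = new WebSocket('wss://example.com/socket')
  ws.addEventListener('open', () => ws.send('hello'))
  ws.addEventListener('message', (e) => console.log(e.data))
  ws.addEventListener('error', (e) => console.error(e))
  ws.addEventListener('close', () => { /* reconnect if you care */ })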

> When a new WebSocket connection is initiated, your server has to handle the HTTP “upgrade” request handshake.

If the author cannot split a tiny string on CRLF sequences they likely shouldn't be programming and absolutely shouldn't be writing an article about transmission. There is only 1 line of data you really need from that handshake request: Sec-WebSocket-Key.

Despite the Upgrade header, the handshake is not actually HTTP. According to RFC 6455 it is a tiny bit of text conforming to the syntax of RFC 2616, which is basically just: lines separated by CRLF, terminated by two CRLFs, and headers separated from values with a colon. Really it's just RFC 822 framing, per RFC 2616.

This is not challenging.
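The one non-obvious step is computing the accept token, which RFC 6455 defines as the SHA-1 of the client's key concatenated with a fixed GUID, base64 encoded. A Node sketch:

  import { createHash } from 'node:crypto'

  // Sec-WebSocket-Accept = base64(sha1(key + fixed GUID from RFC 6455))
  function acceptKey(secWebSocketKey) {
    return createHash('sha1')
      .update(secWebSocketKey + '258EAFA5-E914-47DA-95CA-C5AB0DC85B11')
      .digest('base64')
  }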

I take it this article is written by a JavaScript framework junkie that cannot program, because there is so much in the article that is just wrong.

EDITED: because people get sad.


You're very confrontational, yet your post doesn't really refute the author's main points.

What the author means with "transactional" is that WebSockets have no built-in request-response mechanism, where you can tell which response belongs to which request. It's a weird word choice, but alas.

I do agree that the bit about "handshakes are hard" feels a bit ill-advised btw, but it's not the core argument nor the core idea of this post. The core idea is "do request-response via HTTP, and then use some sort of single-direction stream (maybe over WS, doesn't matter) to keep client state in sync". This is a pretty good idea regardless of how well or how badly you know the WebSocket RFCs by heart.

(I say this as someone who built a request-response protocol on top of websockets and finds it to work pretty well)


> What the author means with "transactional" is that WebSockets have no built-in request-response mechanism

It's not HTTP and does not want to be HTTP. In WebSockets the request/response mechanism is for one side to send a message and then the other side to send a message. If you want to associate a message from one side with a message from the other, put a unique identifier in the messages.
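A sketch of that identifier approach on the client (the message shape and handleServerEvent are made up; ws is assumed to be an open WebSocket):

  // Tag each outgoing message with an id and keep a map of pending
  // promises; the server is assumed to echo the id in its reply
  const pending = new Map()
  let nextId = 0

  function request(payload) {
    const id = nextId++
    ws.send(JSON.stringify({ id, payload }))
    return new Promise((resolve) => pending.set(id, resolve))
  }

  ws.addEventListener('message', (e) => {
    const msg = JSON.parse(e.data)
    if (pending.has(msg.id)) {
      pending.get(msg.id)(msg) // a reply: resolve its request
      pending.delete(msg.id)
    } else {
      handleServerEvent(msg) // everything else is a server push
    }
  })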

If you really want the request/response round trip then don't use WebSockets. I would rather messages just transmit as each side is ready, completely irrespective of any round trip or response, because then everything is fully event oriented and free from directionality.


> If you really want the request/response round trip then don't use WebSockets.

Yes! That's the whole point of the article! You agree with the author!


The world needs more of these "you might not need" articles.

Too many technology fads make things needlessly complicated, and complexity makes systems unreliable.

You might not need Kubernetes

You might not need The Cloud

You might not need more than SQLite

...and so on.


Genuine question, because I agree that there are a lot of overcomplicated systems. I often see people say all you need is SQLite. Do you implement replication yourself? Or are you just accepting that if something happens to your server your data is just gone? I always default to managed Postgres, and that seems to be the simplest, most boring solution.

There's a huge class of applications for which Litestream provides all the replication of SQLite databases you need.

https://litestream.io

https://github.com/benbjohnson/litestream


Replication in SQLite:

   cp data.db <backup location>
On modern cloud systems you shouldn’t have data loss anyway

SQLite is absolutely not suitable if you need non-trivial amounts of write concurrency - SQLite locks the file when writing, and doesn't even notify the next writer when done - writers poll to see if it's unlocked yet. If you don't use WAL mode, then readers have to wait for writers, too.

You can still back up your SQLite database file; you just shouldn't do it in the middle of a write. Either use the SQLite backup API to manage concurrency for you, or back it up in SQL dump format. This isn't one of the usual reasons you shouldn't use SQLite. If you need synchronous replication, then you shouldn't use SQLite.

SQLite is robust against process crashes and even operating system crashes if fsync works as it should (big if, if your data is important), but not against disk failure.

In most of the cases when you shouldn't use SQLite, you should still just upgrade one step to Postgres, not some random NoSQL thing or Google-scale thing.


I'm still waiting for "You might not need React"

We have multiple mission-critical, industrial-grade WebSocket monitoring applications that have been running rock-solid for the last eight years without any hiccups in manufacturing environments. It seems like you're taking an easy-to-maintain codebase and turning it into a complex monstrosity.

Websockets are very low level, so first you want to use a library in order to work seamlessly with all 100 different implementations of websockets, but then you need to make your own protocol on top of it. And implement ping and reconnect.

I wrote a subsystem the other day that used websockets for a server to distribute video conversion tasks.

After futzing with silly things like file transfers and communication protocols, I chucked it out and rewrote it so the client does HTTP long polling of the server and uploads its renders via HTTP POST.

So much easier.


That used to be called “Comet” back in the early 2000s.

Did you try using an established library like socket.io, connectRPC etc? They handle a lot of the complexity.


Long polling is easy - all it means is your server does not immediately respond - nothing more to it than that.

Not really the case for user-facing applications. Proxies can time out, detecting stalls is hard, reconnection is expensive, TCP slow start means higher latency, the overhead is huge for small messages. Implementing it properly is not trivial, the WebSocket standard was created precisely to improve on those shortcomings. Good for you that it works for your case, though if all you need is to listen to a stream you might also be better served by SSE.

I was asking since Socket.io, for example, takes care of file uploads, reconnection, the whole HTTP upgrade flow, and is extremely easy to use, both on client and server. On top of that it can fall back to long-polling if WS is not available.

Here's a link for educational purposes: https://en.wikipedia.org/wiki/Comet_(programming)


Long polling is great for most things that don't need a realtime push. It just gets to be a strain on a server if you've got to set up and tear down lots of those connections from lots of users. Keeping a socket alive is a lot less resource intensive. Maybe it sounds stupid, but I've even converted PHP code that responded to long polling to handle the same polling over a socket to save resources. Most of my apps that need some kind of lazy updates actually work this way, and fall back to REST polling the same services if the socket is down.

I liked vert.x's strategy of seamlessly downgrading the form of connection based on what is available.

Vert.x is great! I'm missing it lately with Node. At least with Vert.x you get a stack trace when you block the event loop by accident...

> It makes your server code more complex.

And that is why we have frameworks that, at least in the case of WebSockets, make things as easy as regular old REST.


I personally view WebSockets as a nicer TCP that has all the messaging functionality you end up building anyway than as an alternative to HTTP.

Maybe I'm naive, but I thought it's: if you need stateful, use websockets; else, use short/long polling or SSE.

With HTTP streaming the browser shows that it's still loading data. Is there some mitigation for it after the initial loading?

That sounds less like a problem with HTTP streaming (initiated from JavaScript) and more like a page with some hanging resource.

The fetch API is asynchronous. The initial page load would deliver the payload that then initiates the streaming connection in the background.

I'm guessing you would use JS to fetch() the stream resource separately.

Sure, it would be just cool to not have to do that

Not sure what you mean. How else are you going to make the request if not using fetch()?

You could use an iframe as well, with certain caveats.

Reads like a series of strawman arguments if you replace "WebSockets" with socket.io.

  - "messages aren’t transactional": You can process a request and return a value to the sender in the socket.io application layer. Is that transactional enough?
  - "If you’re sending messages that don’t necessarily need to be acknowledged (like a heartbeat or keyboard inputs), then Websockets make a great fit". But socket.io has acknowledgements.
  - "When a new WebSocket connection is initiated, your server has to handle the HTTP “upgrade” request handshake.". You can bypass the handshake and go straight to WS even with plain Websockets, and if you don't, socket.io handles the upgrade for you pretty nicely, so you're not parsing HTTP headers yourself.

It's a good thing I didn't then :shrug:

Websockets are a web standard, socket.io is a userland framework


It's like arguing web components suck because there are all these problems you need to solve, while pretending the frameworks (React, Vue, Angular, ...) that solve all those problems don't exist.

Why do you need to implement your own web socket server? Why not use AWS appsync events?

One thing I couldn’t get working with websockets is keeping connections active during code deployments without disconnecting currently connected clients.

Sounds very tricky to me to get right even at scale.


The trick is to make the connection stateless, i.e. any client can connect to any server (just like plain HTTP). Then when there's a new deployment the websocket connection will be terminated and the client can reconnect instantly, automatically finding the next available server.
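The client side of that is a small reconnect loop (a sketch; the URL and backoff numbers are arbitrary):

  // Since any server can take the connection, the client just dials
  // again after a deploy kills its socket
  function connect(attempt = 0) {
    const ws = new WebSocket('wss://example.com/live')
    ws.addEventListener('open', () => { attempt = 0 })
    ws.addEventListener('close', () => {
      const backoff = Math.min(30_000, 1000 * 2 ** attempt)
      setTimeout(() => connect(attempt + 1), backoff)
    })
    return ws
  }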

You probably do. Reliable SSE is a complete nightmare.

Why?

Discord and Slack do the article's suggestion of using websockets for the receiving side only (mostly) and having you switch to HTTP on the calling side. It works pretty well. You have to keep two sets of books, but the websocket side is almost always for events that you, the client, should be responding to, so it works somewhat like a reverse HTTP that still plays nicely with your firewall. It also allows Discord to implement sharding that is trivial from the client's perspective. It really is clever as hell: scaling up a bot/integration is as easy as just turning on sharding and launching multiple instances of your bot. It handles spreading events across them.

The author throws away their own suggestion but it clearly works, works well, and scales well into "supermassive" size. They don't even mention the real downside to web sockets which is that they're stateful and necessarily tied to a particular server which makes them not mesh at all with your stateless share-nothing http servers.


I don't need them but I do like them.

I see the shiny thing and I'm not delusional enough to think I need it.


I think at this point in my career my goal is to continue to never, ever, work on a public-facing website. 20 years into this foray of a career and I’ve avoided it so far.

My HTTP streaming has slowed to more of a trickle the last couple of years.

setInterval.

just use meteor.js https://www.meteor.com/ ?

WebSockets can't go through proxies.

I think what you are getting at is that websockets aren't as simple as http traffic through a proxy, but you absolutely can use proxies and ws connections just fine and for a variety of reasons.

For all the other comments: parent is probably talking about forward proxies, and to their point, many forward/enterprise proxies have configurations which cause websockets to break, and it is a pain to debug this if you have many enterprise customers.

Echoing this. At $DAYJOB some 5-10% of customers will fail to initiate a websocket connection, even over wss:// despite plain HTTPS requests working fine. This is a client-side issue with whatever outdated HTTP CONNECT implementation the enterprise has.

This isn't based on any facts

Works completely fine in Haproxy

I use them through nginx/Cloudflare. They work fine.

I've definitely used websockets through nginx

They even go through Cloudflare.

or Fly.io

Says who?


The article forgot to mention that websockets add state to the server! Load balancing will require sticky sessions. At scale this tends to mean separating websocket servers completely from http servers.
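For reference, the sticky-session shape of that in nginx looks something like this (upstream addresses are made up; ip_hash is the crudest stickiness option):

  upstream ws_backend {
    ip_hash;                  # same client IP -> same backend
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
  }
  server {
    location /ws {
      proxy_pass http://ws_backend;
      proxy_http_version 1.1;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";
      proxy_read_timeout 1h;  # don't kill idle sockets
    }
  }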


