Yes, it's like surfing porn in the early internet years using a dial-up modem. One line at a time, until you can finally see enough of the picture (reply) to realize it was not the reply you were looking for.
LLM streaming must be a cost-saving feature to prevent you from overloading the servers by asking too many questions within a short time frame. Annoying feature IMHO.
How is hiding it behind a loading spinner any better? You still can't spam it with questions, since you have to wait for it to finish either way. With streaming you can at least hit the stop button if the answer looks wrong, so if anything you can fire off more questions with it enabled.
For me, the constant visual changes as new parts stream in are annoying and straining on the eyes. Ideally, web frontends would honor `prefers-reduced-motion` and buffer the response when it's set; a sketch of that idea follows below.
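A minimal sketch of what that could look like, assuming a hypothetical `readTokens` async iterator over the streamed tokens (`matchMedia` is the real browser API; everything else is illustrative):

```ts
// Honor prefers-reduced-motion in a chat frontend (sketch).
// `readTokens` is a hypothetical source of streamed tokens.
const reduceMotion =
  window.matchMedia("(prefers-reduced-motion: reduce)").matches;

async function render(readTokens: AsyncIterable<string>, el: HTMLElement) {
  if (reduceMotion) {
    // Buffer the whole response and paint it once.
    let buffer = "";
    for await (const token of readTokens) buffer += token;
    el.textContent = buffer;
  } else {
    // Append tokens as they arrive (the usual streaming effect).
    for await (const token of readTokens) el.textContent += token;
  }
}
```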
Personally, I've fallen in love with the visual effect of streaming text you're talking about. It's a bit Pavlovian, but in my head it signifies that I'm reading something high-signal (even though it isn't always).
It's more about UX, to reduce the perceived delay. LLMs inherently stream their responses, but if you wait until the LLM has finished inference, the user is sitting around twiddling their thumbs.
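To make that concrete, here's a rough sketch of consuming a streamed HTTP response in the browser; the `/v1/chat` endpoint and request body are placeholders, not any particular vendor's API. The point is that the first chunk can be rendered after the model's time-to-first-token instead of after the entire generation:

```ts
// Consume a streamed completion chunk by chunk (sketch, placeholder endpoint).
async function* streamCompletion(prompt: string): AsyncGenerator<string> {
  const res = await fetch("/v1/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each decoded chunk can be shown to the user immediately.
    yield decoder.decode(value, { stream: true });
  }
}
```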