It's more about UX, to reduce the perceived delay. LLMs generate their output one token at a time anyway, so streaming the tokens as they're produced costs nothing; if you instead wait until inference has finished before showing anything, the user is left sitting around twiddling their thumbs.
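
For illustration, a minimal sketch of consuming a streamed response, using the OpenAI Python SDK (v1+); the model name and prompt are just placeholders:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # stream=True makes the server send tokens back as they are
    # generated, instead of waiting for the full completion
    stream = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": "Explain streaming."}],
        stream=True,
    )

    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)  # show each token as it arrives
    print()

The user starts reading at the time-to-first-token rather than at the end of the whole generation, which is what makes the wait feel shorter even though total latency is unchanged.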
