> Since decoder-only transformer memory requirements scale with the square of sequence lengths, things would probably slow down significantly for very long sequences, which would be required for a back-and-forth conversation.
You can use tricks to keep the sequence length down even if the conversation goes on for a long time. For example, you can use the model to summarize the first n-1 lines of the conversation and append the last line, as-is, to that summary.
I don't have any sources to refer to, but "text summarization" is one of the common NLP tasks that LLMs are benchmarked on. All of these general-purpose LLMs can do a decent job at text summarization (some, such as ChatGPT, can do zero-shot summarization at high quality, whereas others need to be fine-tuned for the task).

If your problem is that feeding a large amount of text to the model is slow/expensive, then summarization will obviously mitigate that issue. After summarizing most of the input text, you still feed in the latest input without summarization, so that, for example, if the user asks a question, the LLM can accurately answer it. (If all of the input goes into the summarization, that last question may not even appear in the summary, so results will be crap.)
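A rough sketch of the idea, using a Hugging Face summarization pipeline (the model name, length limits, prompt wording, and the `compress_history` helper are all illustrative assumptions, not a specific recommended setup):

```python
from transformers import pipeline

# Illustrative model choice; any summarization-capable model would do.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def compress_history(turns: list[str], max_summary_tokens: int = 150) -> str:
    """Summarize all but the last turn, then append the last turn verbatim."""
    if len(turns) < 2:
        return "\n".join(turns)
    history = "\n".join(turns[:-1])
    summary = summarizer(
        history,
        max_length=max_summary_tokens,
        min_length=30,
        do_sample=False,
    )[0]["summary_text"]
    # The latest user message is left unsummarized so the model sees it
    # exactly as written and can answer it directly.
    return f"Summary of the conversation so far: {summary}\n{turns[-1]}"

turns = [
    "User: I'm planning a trip to Japan in April.",
    "Assistant: Great choice, that's cherry blossom season...",
    "User: What should I pack for the weather?",
]
prompt = compress_history(turns)  # feed this to the chat model instead of the full history
```

The point is simply that the prompt length stays roughly constant no matter how long the conversation gets, at the cost of an extra summarization call per turn.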