
Yes. Solving context length has been attempted with hundreds of different approaches, and yet most LLMs remain almost identical to the original Transformer from 2017.

Just to name a few families of approaches: Sparse Attention, Hierarchical Attention, Global-Local Attention, Sliding Window Attention, Locality-Sensitive Hashing Attention, State Space Models, EMA-Gated Attention. (A rough sketch of one of these, sliding-window attention, below.)
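
For the curious, here's a minimal sketch of sliding-window attention in NumPy. The function name and shapes are my own illustration, not any particular library's API; real implementations fuse the masking into the attention kernel so they never materialize the full score matrix. The idea: each query attends only to the previous `window` keys, so cost grows roughly linearly with sequence length instead of quadratically.

    # Illustrative sketch only; not an efficient implementation.
    import numpy as np

    def sliding_window_attention(q, k, v, window: int):
        """Each query position attends only to keys within `window` steps back.

        q, k, v: arrays of shape (seq_len, d_head).
        """
        seq_len, d = q.shape
        scores = q @ k.T / np.sqrt(d)          # (seq_len, seq_len)
        pos = np.arange(seq_len)
        # Causal mask restricted to a local window: key j is visible to
        # query i only if 0 <= i - j < window.
        dist = pos[:, None] - pos[None, :]
        mask = (dist < 0) | (dist >= window)
        scores = np.where(mask, -np.inf, scores)
        # Row-wise softmax over the visible keys.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v

    # Toy usage: 8 tokens, 4-dim head, window of 3.
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
    out = sliding_window_attention(q, k, v, window=3)
    print(out.shape)  # (8, 4)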




I assume there is a common point of failure?

Notably, human working memory isn't great either. Which raises the question (if the comparison is valid) of whether that limitation might be fundamental.


The failure mode is that only long-context tasks benefit; short ones run fast enough with full attention, and with better quality. It's amazing that OpenAI never used them in any serious LLM, even though training costs are huge.



