I’ve absolutely explored this idea but, similar to lossy compression, sometimes important nuance is lost in the process. There is both an art and a science to recalling the gently compacted information and recognizing when it needs to be repeated back.
If there were something like objects in OO programming, but for LLMs, would that solve this?
Something like a Topic-based Personality Construct, where the model first determines which of its “selves” should answer the question, and then grabs the appropriate context for the situation.
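A minimal sketch of how that OO analogy might map onto code, with each “self” owning only its own compacted context. Everything here is hypothetical: the `Persona` and `PersonaRouter` names, the topic definitions, and the keyword-overlap scoring, which stands in for the model’s own “which self should answer?” step.

```python
from dataclasses import dataclass, field

@dataclass
class Persona:
    """One topic-scoped 'self': its own system prompt, routing keywords,
    and private context store (the 'gently compacted' notes)."""
    name: str
    system_prompt: str
    keywords: set[str]
    context: list[str] = field(default_factory=list)

class PersonaRouter:
    """Dispatches a question to the persona most likely to own the topic,
    then assembles only that persona's context into the final prompt."""
    def __init__(self, personas: list[Persona]):
        self.personas = personas

    def route(self, question: str) -> Persona:
        # Stand-in for the model deciding which self should answer:
        # score each persona by keyword overlap with the question.
        words = set(question.lower().split())
        return max(self.personas, key=lambda p: len(p.keywords & words))

    def build_prompt(self, question: str) -> str:
        persona = self.route(question)
        context = "\n".join(persona.context)
        return f"{persona.system_prompt}\n\nContext:\n{context}\n\nQ: {question}"

# Hypothetical usage: two selves, each carrying only its own notes.
router = PersonaRouter([
    Persona("devops", "You are the infrastructure self.",
            {"deploy", "kubernetes", "ci"}, ["Cluster runs k8s 1.29."]),
    Persona("frontend", "You are the UI self.",
            {"react", "css", "component"}, ["App uses React 18."]),
])
print(router.build_prompt("Why does the deploy fail on kubernetes?"))
```

The appeal is the same as encapsulation in OO: a question about deployments never pulls frontend context into the window, so less gets lossily compressed in the first place.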