Yeah, it's completely separate. The LLM just gets some extra text in the prompt, t...

CamperBob2 · on June 3, 2023

One thing I don't understand is how feeding the entire conversation back as a prefix for every prompt doesn't waste the entire 4K-token context almost immediately. I'd swear that a given ChatGPT window is stateful, somehow, just for that reason alone... but everything I've read suggests that it's not.
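For illustration, here is a minimal sketch of how a stateless client could resend the conversation each turn without overflowing the window: keep the newest messages and drop the oldest once a token budget is hit. This is only an assumption about one possible approach, not a description of what ChatGPT actually does. `count_tokens` and `build_prompt` are hypothetical names, and the word-count "tokenizer" is a crude stand-in for a real one.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer (assumption): one token per word.
    return len(text.split())


def build_prompt(history, new_message, budget=4096):
    """Return the messages to send this turn, oldest first.

    The new message is always kept; older messages are added newest-first
    until the next one would exceed the token budget.
    """
    kept = [new_message]
    used = count_tokens(new_message)
    for msg in reversed(history):
        cost = count_tokens(msg)
        if used + cost > budget:
            break  # everything older than this is dropped
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


# The "state" lives entirely on the client side, as a plain list.
history = []
for user_text in ["hello there", "tell me a long story", "and then what"]:
    prompt = build_prompt(history, user_text, budget=8)
    history.append(user_text)
```

With a scheme like this the window does fill up quickly; the oldest turns simply fall out of the prompt, which would look like the model "forgetting" the start of a long chat.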

thomasahle · on June 2, 2023

Have you tried something like Memory Transformers https://arxiv.org/abs/2006.11527 where you move the k/v pairs that don't fit in the context window to a vector db? Seems like a more general approach, but I haven't tested them against each other.
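To make the idea concrete, here is a rough sketch of attention over an external key/value store: evicted k/v pairs go into a memory, and a query retrieves its top-k nearest keys and attends only over those. This is an illustration of the general idea as described in this comment, not the method from the linked paper. `KVMemory` and `attend` are hypothetical names, and a brute-force list scan stands in for a real vector db.

```python
import math


class KVMemory:
    """Toy external store for key/value vectors (brute-force search)."""

    def __init__(self):
        self.keys = []
        self.values = []

    def add(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def top_k(self, query, k):
        # Score every stored key by dot product with the query.
        scores = [sum(q * x for q, x in zip(query, key)) for key in self.keys]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return [(i, scores[i]) for i in order[:k]]


def attend(query, memory, k=2):
    """Softmax attention restricted to the k best-matching stored keys."""
    hits = memory.top_k(query, k)
    top = max(score for _, score in hits)
    # Subtract the max score before exponentiating for numerical stability.
    weights = [math.exp(score - top) for _, score in hits]
    total = sum(weights)
    dim = len(memory.values[0])
    out = [0.0] * dim
    for (i, _), w in zip(hits, weights):
        for d in range(dim):
            out[d] += (w / total) * memory.values[i][d]
    return out


memory = KVMemory()
memory.add([1.0, 0.0], [10.0, 0.0])
memory.add([0.0, 1.0], [0.0, 10.0])
memory.add([-1.0, 0.0], [-10.0, 0.0])
nearest = attend([5.0, 0.0], memory, k=1)
```

A real setup would swap the list scan for an approximate nearest-neighbor index, since scanning every stored key on every query defeats the purpose once the memory gets large.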