I still don't totally understand how such a huge context works with Gemini. I guess you don't send the whole context with every request? So it keeps (and updates) context for a specific session?
Gemini is better than Sonnet if you have broad questions that concern a large codebase; the larger context size seems to help there. People also use subagents for specific purposes to keep each context manageable, where possible.
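For the subagent part, a rough sketch of the idea (nothing provider-specific; `call_model` is just a stand-in for whatever chat completion call you use): the parent delegates a focused task to a subagent that starts with a fresh, small context, and only the subagent's short result gets folded back into the parent's history.

```python
# Sketch of the subagent idea: a focused task runs in its own fresh context,
# and only a compact summary of the result flows back into the parent's history.

def call_model(messages: list[dict]) -> str:
    # Placeholder for a real chat completion call (Gemini, Claude, etc.).
    return "stub summary"

def run_subagent(task: str, file_contents: list[str]) -> str:
    # Fresh, isolated context: just this task and the files it needs.
    messages = [
        {"role": "system", "content": "You handle one narrow sub-task."},
        {"role": "user", "content": task + "\n\n" + "\n\n".join(file_contents)},
    ]
    return call_model(messages)

def delegate(parent_messages: list[dict], task: str, file_contents: list[str]) -> None:
    summary = run_subagent(task, file_contents)
    # Only the short result enters the parent context, not the raw files.
    parent_messages.append({"role": "user", "content": f"Subagent result: {summary}"})
```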
On a related note, I think the agent metaphor is a bit harmful because it suggests state, while the LLM itself is stateless.
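To make that concrete, and to answer the question above: in the usual setup the model keeps nothing between calls. The "agent" is a plain loop on the client that appends each turn to a message list and resends the entire list on every request (providers may cache the repeated prefix server-side, but that's a cost optimization, not memory). A minimal sketch, again with `call_model` standing in for the actual API:

```python
# The "agent" is just client-side state plus a loop: the model sees only what
# is resent each call. Nothing persists on the model side between requests.

def call_model(messages: list[dict]) -> str:
    # Placeholder for a real chat completion call (Gemini, Claude, etc.).
    return "stub response"

def agent_loop(user_inputs: list[str]) -> list[dict]:
    messages = [{"role": "system", "content": "You are a coding assistant."}]
    for user_input in user_inputs:
        messages.append({"role": "user", "content": user_input})
        # The full history goes out on every single request.
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
    return messages
```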