I still don't totally understand how such a huge context works with Gemini. I guess you don't send the whole context with every request? So it keeps (and updates) context for a specific session?
Gemini is better than Sonnet if you have broad questions that concern a large codebase; the larger context size seems to help there. People also use subagents for specific purposes to keep each context manageable, where possible.
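For the subagent part, a rough sketch of the idea (nothing provider-specific; `call_model` is just a stand-in for whatever chat completion call you use): the parent delegates a focused task to a subagent that starts with a fresh, small context, and only the subagent's short result gets folded back into the parent's history.

```python
# Sketch of the subagent idea: a focused task runs in its own fresh context,
# and only a compact summary of the result flows back into the parent's history.

def call_model(messages: list[dict]) -> str:
    # Placeholder for a real chat completion call (Gemini, Claude, etc.).
    return "stub summary"

def run_subagent(task: str, file_contents: list[str]) -> str:
    # Fresh, isolated context: just this task and the files it needs.
    messages = [
        {"role": "system", "content": "You handle one narrow sub-task."},
        {"role": "user", "content": task + "\n\n" + "\n\n".join(file_contents)},
    ]
    return call_model(messages)

def delegate(parent_messages: list[dict], task: str, file_contents: list[str]) -> None:
    summary = run_subagent(task, file_contents)
    # Only the short result enters the parent context, not the raw files.
    parent_messages.append({"role": "user", "content": f"Subagent result: {summary}"})
```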
On a related note, I think the agent metaphor is a bit harmful because it suggests state, while the LLM itself is stateless.
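To make that concrete, and to answer the question above: in the usual setup the model keeps nothing between calls. The "agent" is a plain loop on the client that appends each turn to a message list and resends the entire list on every request (providers may cache the repeated prefix server-side, but that's a cost optimization, not memory). A minimal sketch, again with `call_model` standing in for the actual API:

```python
# The "agent" is just client-side state plus a loop: the model sees only what
# is resent each call. Nothing persists on the model side between requests.

def call_model(messages: list[dict]) -> str:
    # Placeholder for a real chat completion call (Gemini, Claude, etc.).
    return "stub response"

def agent_loop(user_inputs: list[str]) -> list[dict]:
    messages = [{"role": "system", "content": "You are a coding assistant."}]
    for user_input in user_inputs:
        messages.append({"role": "user", "content": user_input})
        # The full history goes out on every single request.
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
    return messages
```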