It's not much different than your proxy idea.
It's implemented as a transformation between the internal message structure (close to Anthropic API's) to OpenAI message spec and vice-versa. Then, it's calling all the other models using the openai-node client, as pretty much everyone supports that now (openrouter, ollama, etc)
The interesting part, to me, is that claude-code is made by people who know their models well. Having the same tool work with other models lets us get a better feel of how to make better coding agents
https://github.com/dnakov/little-rat