It's not much different than your proxy idea.
It's implemented as a two-way transformation between the internal message structure (close to the Anthropic API's) and the OpenAI message spec. It then calls all the other models through the openai-node client, since pretty much everyone supports that API now (OpenRouter, Ollama, etc.).
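
To make that concrete, here's a rough sketch of what such a two-way mapping could look like for plain text messages. The type and function names (`InternalRequest`, `toOpenAI`, `fromOpenAI`) are made up for illustration and are not the actual implementation; the internal shape is only assumed to resemble Anthropic's (a top-level system prompt plus messages made of content blocks), and tool calls and images are left out.

```typescript
// Hypothetical internal shape, loosely modeled on Anthropic-style messages:
// a separate system prompt, and messages whose content is a list of blocks.
type InternalBlock = { type: "text"; text: string };
type InternalMessage = { role: "user" | "assistant"; content: InternalBlock[] };
interface InternalRequest {
  system?: string;
  messages: InternalMessage[];
}

// Minimal, text-only subset of the OpenAI chat message shape.
type OpenAIMessage = { role: "system" | "user" | "assistant"; content: string };

// Internal -> OpenAI: the system prompt becomes a leading "system" message,
// and each message's text blocks are concatenated into one string.
function toOpenAI(req: InternalRequest): OpenAIMessage[] {
  const out: OpenAIMessage[] = [];
  if (req.system) {
    out.push({ role: "system", content: req.system });
  }
  for (const m of req.messages) {
    out.push({ role: m.role, content: m.content.map((b) => b.text).join("") });
  }
  return out;
}

// OpenAI -> internal: "system" messages are pulled out into the top-level
// system field; everything else becomes a single text block.
function fromOpenAI(messages: OpenAIMessage[]): InternalRequest {
  const systemParts: string[] = [];
  const out: InternalMessage[] = [];
  for (const m of messages) {
    const role = m.role;
    if (role === "system") {
      systemParts.push(m.content);
    } else {
      out.push({ role, content: [{ type: "text", text: m.content }] });
    }
  }
  const req: InternalRequest = { messages: out };
  if (systemParts.length > 0) {
    req.system = systemParts.join("\n");
  }
  return req;
}
```

The output of `toOpenAI` is what you'd hand to openai-node's `client.chat.completions.create({ model, messages })`. Switching providers is then mostly a matter of constructing the client with a different `baseURL` and API key, assuming the provider exposes an OpenAI-compatible endpoint.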