In cybersecurity red and blue test are two equal forces. In software development the analogy I think is a stretch, coding and testing are not two equal forces. Test is code too, and as such, it has bugs too. Test runs afoul with police paradox: Who polices the police? The Police police the police.
I interpret it a different way than that. I see application code and testing code as both a part of blue team. It's the code reviews and architectural critiques that are part of red team.
Personally, I've found GitHub's feature of AI PR reviewers exceptionally helpful. I think that's the type of red team LLM app Tao is describing here.