There's another side to that coin: at the moment nobody takes issue with OpenAI filtering and curating what goes into their training data set, aside from perhaps the anti-bias crowd. "AI neutrality" is not yet a topic. Yet.
I've already seen that several times with image generation. The most recent was an article commenting on how the American smile was polluting generated photos. People can't decide what they want. Do they want licensed, curated commercial photos in the database, or do they want search-engine-style neutrality? You really can't have both.