It’s because they used data from the public to train their models.
If someone suddenly applied the rule that copyright bans remixes to the training of neural networks, and insisted that licenses for such use be granted explicitly, Google would lose 90% of its advantage over other companies.
Personally, I’d be in favor of requiring companies to open source their trained models if the training data included data supplied by users rather than by paid employees.
I think the world needs the equivalent of OpenStreetMap but for speech data, so that the data is under a copyleft license that legally enforces reciprocation when the corpus is used or modified.
Kaldi is much better, but very difficult to set up.
None of the open source speech recognition systems (or commercial ones, for that matter) come close to Google’s.