This is really harmful and unethical work. It will be used to hurt millions of elderly people with scams. That's the real application that will happen 100x more than anything else. It's unethical and harmful to release tools that will be overwhelmingly used to hurt elderly people. What they should do about it is: Stop releasing models. Only release a service so that scammers will not use it. Also, only release audio that is watermarked, so that apps can tell that a phone call might be a scam. When they share models with researchers, use previous best practices: post a Google Form to request access.
Just imagine if this line of thinking was used elsewhere.
This tech is already out of the bag and I thank the author(s) for the contribution to humanity. The correct solution here is not to shove your head in the sand and ignore reality, but to get your government to penalize any country or company that facilitates this crime. If they can force severe penalties for other financial crimes and funding terrorism, they can do the same here.
Scammers scamming old people is already very widespread, so should we maybe outlaw telephones as well? Or maybe mandate anti-scam filters that disconnect the call if something that could be a scam is discussed? Now that I think about it, that would actually make more sense, but it would still be problematic.
Millions of elderly people are already getting scammed by overseas call centers, so unless we do something more significant, this tech will not make one iota of difference.
That's not really true. Most scammers have a male voice with a heavy accent. When they have tools that easily disguise their voice, scammers can reach many more elderly people.
That might have been true about a year ago, but I've been getting calls from well-spoken native-level scammers for about two months now. They are so frequent that I can put them on speaker during family gatherings to raise awareness.
A sample size of 1 is never representative, but scammers definitely have full access to native speakers or to tech that can generate very passable speech.
It seems quite possible that the change you've seen in these last two months is because some scammers have started using these models. That is more likely than a sudden huge shift in either the country of origin or the English skills of the scammers.
My point is that these models were already out there before StyleTTS2 was released. Plugging your ears and demanding their regulation in your country will not make them disappear.