With Tortoise TTS you can feed in voice samples from different people to produce new voices that vaguely sound like a blend of the speakers. It's a bit of a crapshoot, sometimes resulting in wild things like a female voice morphing into a male one mid-sentence, or sudden gasps and bizarre demonic croaking. Playing with it has been pretty fun.
I've even used Japanese inputs, and the outputs kind of sound like the original speakers talking in native English. They're still a bit uncanny, and less stable than what you get from inputs that match the model's language.
Forget deepfaking, it's quite literally a bottomless pit of unique voices that don't exist.