VA is a real pain, especially for an indie game dev.
It completely changes your workflow because you need to have your script ready and finalized so you can record all the lines at once. Say goodbye to iterative development. Want to tweak something or add additional lines? Yeah, that is going to cost you.
I think "AI"-based voice generation (well, data-driven voice generation; I will die on the hill that it should not be called AI, as it is not intelligent) will absolutely have a place for prototyping games and having filler voice acting.
The problem, though, is that good VA can absolutely make or break a game. Will players really tolerate "AI" voices? I hope not! There is a reason certain famous voice actors are paid good money: it is an art.
People will watch certain shows or play certain video games simply because they have a voice actor that they like in them. You can't get that emotional attachment from generated voices.
Voice carries so much more information than just the text. It gives hints to the emotional state of the speaker, their background and education, their intentions. "AI" can never replace top-tier voice acting from humans simply because that would require a general intelligence.
> You can't get that emotional attachment from generated voices.
Vocaloid managed to turn people’s emotional attachment to generated voices into a billion-dollar industry.
(I’m actually crying right now, just remembering the emotional impact of The Disappearance of Hatsune Miku, a song written from the perspective of a piece of voice synthesis software which is in the process of being uninstalled, and performed at a speed no human singer could match, as she tries to squeeze in one last goodbye T_T)
> "AI" can never replace top-tier voice acting from humans simply because that would require a general intelligence.
This exact same argument has been used as a desperate means of gatekeeping what it means to be intelligent across every domain. Deep Blue with chess, AlphaGo with Go, diffusion-based generative art, the list goes on and on. It's amazing to me that people continue to make this mistake.
Everyone thinks that their domain is safe, and everyone is inevitably wrong.
There have already been many high-profile, realistic-sounding TTS engines in the last 10 years. Some of the newest advancements allow you to combine a TTS voice with a recording of your own voice. So a game developer can record a line of dialogue angrily and with emotion, and then that delivery is transferred onto the appropriate TTS-generated voice (elf princess, dwarven blacksmith, etc).
Voice acting is a field absolutely ripe for disruption.
> People will watch certain shows or play certain video games simply because they have a voice actor that they like in them.
There are certainly people who do this, but you're projecting. This is objectively a minority of the target market. I could canvass 100 children who have watched Amphibia or The Owl House and it's doubtful a single one of them could name a single voice actor in those shows.
This applies even more so to voice acting in video games. I'm an avid game player, everything from GTA to Tomb Raider to Portal to Baldur's Gate. Gun to my head, I couldn't name a single person who voiced a single line.