Actor Stephen Fry says his voice was stolen from the Harry Potter audiobooks and replicated by AI, and warns this is just the beginning
The actor told an audience in London that AI was a “burning issue” for actors on strike.
I’m sorry, but while I understand this is a more visible issue for actors and voice actors, there are a lot of people who are going to be hurt by this in the long run.
You think scam calls are bad now? Imagine if Gamgam gets a call from “you” saying you’re hurt and scared and need money to be safe. And I don’t mean just someone pretending to be you, I mean that the person on the other end of the phone sounds exactly like you, up to and including the pauses in your voice, the words you choose, and even the way you roll your r’s. All because someone skimmed your public Facebook videos.
Someone wants that promotion you’re going to get? Record your voice a few times, then have “you” drunk-call your boss, hitting on them and then harassing them when they don’t react well to it.
This is the exact kind of thing people were worried about years ago when I first started using the internet, and it wasn't even possible yet! Common practices included never giving your real name for anything, and never posting pictures or video of yourself.
So no daily mirror selfies or extensive vacation albums? No checking in anywhere? No open discussions on subjects that could be used as data points to create a digital profile of me? Why even use social media then?
I’m sorry, but while I understand this is a more visible issue for actors and voice actors, there are a lot of people who are going to be hurt by this in the long run.
I'm sorry, but as somebody who's tried out the tech, the amount of voice data required is still many hours. Even the more professional AI cloning websites that let you clone your own voice require that you submit "a couple of hours" of your voice data. Musicians and voice actors end up in the middle of this because they already have many hours of voice work out there. And in many cases, the text transcript needed to train a voice model already exists too. An audiobook, for example, comes with the text of the book.
You think scam calls are bad now?
You think scam call centers are going to spend the time to look for voice clips, parse them out, transcribe them into text, put them in a model, train that model for many hours, realize the Python code needs some goddamn dependency that will take many more hours to debug, fix parameter settings, and then get a subpar voice model that couldn't fool anybody because they don't have enough voice clips?
They can't even be bothered to look up public information about the person they are calling. Fuck, the last call I got was from a "support center for your service", and when I asked "which service?", they immediately hung up. They do not give a fuck about coming prepared with your personal details. They want the easiest mark possible, someone who doesn't ask questions and can be scammed without the scammer even knowing their name.
Imagine if Gamgam gets a call
Who's Gamgam?
Record your voice a few times
Yeah, sorry, you need more than a "few times" or a "few voice clips".
Huh, DuckDuckGo came up with "favorite Southern grandma names" and "50 best grandma names" as the second and third articles. You do have to know what to type in and not always look at the first thing that comes up. I searched "who is a gamgam" and found tons of stuff about grandmas.
I have personally used VALL-E and tried it out. What they are claiming is absolute bullshit. It is "a" voice, but it's certainly nowhere close to "your" voice. Don't believe me? You can try it out yourself.
Perhaps owing to VALL-E's ability to potentially fuel mischief and deception, Microsoft has not provided VALL-E code for others to experiment with, so we could not test VALL-E's capabilities.
You think maybe that GitHub release, which isn't from Microsoft, might not be the same thing despite the name?
Amazon showed off voice cloning over a year ago, and iirc it was claimed not to require hours of content. You’re lagging in your understanding of current capabilities, never mind the fact that I was talking about the near future.