Video game actors are worried about AI tech taking over their jobs

Self-service checkouts, lights-out manufacturing, self-driving cars, and wars waged by robotic drones – automation is coming for everyone’s jobs eventually, and our capitalist systems aren’t equipped to carry us through it.

As I type this, teams are working on machine-learning AI that can write simple news articles with perfect grammar and without rambling introductions. They can’t be self-deprecating like me – and they certainly can’t interview people – but they sure can put words in order.

No one is safe from the death march of technological advancement. Not even actors.

Altered AI is a company that promises to give game developers the tools to “create compelling, professional voice performances.” It has a library of around 20 professional actors and hundreds of what it calls more “common” voices that devs can use to populate their games.

Just submit a recording of what you want to say and how you want it said, and a “performance” comes out on the other side. Or you can input your actors’ performances and change their tone, voice type, and more. There’s even an example on the website where a male actor’s line read is changed to sound feminine, which definitely won’t be abused in horrible ways. No, sir.

“I never believed in human replacement,” Altered AI CEO and former Google employee Ioannis Agiomyrgiannakis tells GLHF, before likening the tech to using a car to go fast or a forklift to carry a heavy load.

This tech is already out there. If you’ve played The Ascent, a recent twin-stick shooter set in a cyberpunk world, you’ve already played a game that features AI voices. Altered AI also works with triple-A game developers – Ninja Theory, developer of the upcoming Senua’s Saga: Hellblade II, is one. Of course, the details of their partnership are under wraps.

Most of the big developers the company works with have NDA clauses that forbid Agiomyrgiannakis from announcing the partnerships. The games industry is full of secrets, but developers don’t want you to know about this, just like how they don’t want you to know about the deals they may or may not have with arms manufacturers.

“What we are making are tools that allow people to do performance by themselves,” Agiomyrgiannakis explains. “People in the gaming industry use us for prototyping. When you have a dialogue, you have a level of imagination. But when you take the dialogue to the voice actors, it comes back and doesn’t sound as dynamic as you wanted it to. So there’s a gap between how the writer imagines the dialogue, and how the dialogue plays out. We provide an intermediate step where they can prototype the dialogue and have a checkpoint before they hit the studio.”

But is that the right way to tease out a good performance? “Here, I know you have acting school and 20 years of experience under your belt, but can you be more like Autotune HAL 9000, please?” Ask a bunch of actors and they’ll tell you why this is a bad idea.

“Maybe you’re familiar with line reads – saying the line the way you think you want it to sound so the actor can copy it,” Sarah Elmaleh, who plays Gears 5’s Lizzie Carmine, says. “Line reads are usually an unfortunate last-ditch failure of direction, and ‘copying’ usually sounds dead on its feet. When you hire an actor, you get so much more than a sound. Great dialogue doesn’t just come from the mouth, it comes first from the heart. Mouth-to-mouth is good for CPR, bad for acting.

“Actors hook into the environment, relationships, history, and intention, and you can hear all of those things in their delivery. You can either help guide them to those things or you can force them to reverse-engineer backward from the ‘sound’ of your read. The read does not work without some or all of those things authentically in place. Some of the most wonderful moments in a session are when an actor surprises you and maybe even themselves with a deeply motivated choice you never expected.”

Red Dead Redemption 2’s Roger Clark has similar concerns. When I speak to him, he tells me the situation with AI makes him think of the tale of King Canute and the tide, in which he tells his courtiers that he has no power to stop the elements. Automation seems inevitable at this point and if you want to stop it, you might as well get a job as a janitor and try mopping up the sea.

“I feel AI is a viable solution for some, but I’d be lying if I said there weren’t some concerns as to how it may impact actors and their capacity to work,” Clark says. “I feel that humanity cannot be digitised. We are all experts on being human and can sniff out imitations with impressive speed and accuracy. I am interested in what AI can do, but its capacity for imitating real people is alarming, and I know we have all speculated about the potential damage that could do – legally, financially, and reputationally.

“I think the true test will be when AI can seamlessly ‘work’ with real actors and the audience won’t be able to tell the difference. When AI can play off of a real person and adapt and react truthfully as another actor could, then we’re in trouble. My favourite thing about performance is when an actor surprises me but remains loyal to the character, situation, and to basic human nature. People are complex, fascinating creatures and will always have the edge when it comes to reactive adaptability. When two or more actors are working together and have a genuine connection, it’s magical. Human audiences can smell authenticity a mile away — at least I sure hope so, otherwise I guess it’s time to go back to bartending. I just hope they don’t have a machine doing that as well.”

Sorry to say it, Mr Clark, but there are already bars where robots serve you.

When working on Firewatch, an indie game about a man who heads off to work on a nature reserve to escape a traumatic event, collaboration in the voice-over booth became the very soul of the game. Played entirely from the first-person perspective, Firewatch swaps out flashy cutscenes for walkie-talkie conversations, encouraging the player to intuit emotions through audio.

“A human performance is at the heart of great games,” Cissy Jones, who plays Firewatch’s Delilah, explains. “A synthetic performance is soulless by definition, thus losing the art, the collaboration, the creative spark that comes from people working together to create the narrative and emotional immersion that gamers and broader audiences deserve. Working on Firewatch was a collaborative process every time we got in the booth, reworking lines and making each other laugh to figure out what worked – together. The reason people hire voice actors is because we bring the unexpected. We make words on a page come to life. That’s the magic.”

I’m invited to another video call, this time to show me a demonstration of what Altered AI can do. The presenter is Caitlyn Oenbrick Rainey, who worked in voice acting and is now in-house at Altered.

One of the ways Altered can work is its text-to-speech model, which takes text in and spits an audio file back out. This is where the tech is useful for prototyping. Traditionally, someone at the game studio would voice these lines and they’d be used as a placeholder until the real performance happens. This reminds me of Ninja Theory’s Hellblade, which cast the studio’s video editor, Melina Juergens, after the team realised she perfectly fit the character during the prototyping stage. Juergens has won awards for her work on the game. It’s ironic that Ninja Theory is one of the first big studios to work with the tech.

Next, Rainey shows me a voice sample she recorded earlier, then alters it so she sounds like a 40-year-old man from the Deep South. There’s another where she speaks a line and converts it to a whisper, then another where she does the same and converts it to a shouting man. It can even take a performance in another language and convert that while keeping the accent intact. According to Rainey, the tech is focused on the prosody of the performance and can transfer the pitch, the dynamic, and even the energy of a line read.

This is the other side of Altered AI, outside of prototyping. The technology can, will, and probably already does go further than that. Background characters are one area where developers could significantly reduce costs if they used this tech over actual actors.

In one suggested use case, main characters can be played by actual actors, while crowds and other additional characters are populated with AI. Agiomyrgiannakis claims this could lead to more work for actors, not less. He says he’s already been approached by developers who originally planned to use text bubbles for dialogue, and instead opted to use AI. And, of course, someone has to feed the AI. As I mentioned earlier in the piece, 20 professional actors (all anonymous, some award-winning) have already signed their voices over to the company.

“We hide them,” Agiomyrgiannakis says. “We’ve hidden them so hard that we don’t even know. We never got exposed to their names.”

When I get a look at the back end of the site, I realise he’s not joking – they’re all called “Dennis” or “Rod” and have avatars that look like stock photos or the staff page of a car dealership. They’re hidden because they’re afraid of a backlash from their peers, who are worried that the tech will lead to a reduction in opportunities.

“Bloggers didn’t kill the newspapers,” Agiomyrgiannakis says. “YouTubers didn’t kill the TV. People just consume more nowadays.”

For many performers, these background roles are where they cut their teeth. And with more and more triple-A games casting Hollywood talent for their major roles, there’s a real fear that this kind of technology could prevent fresh talent from breaking in.

“AI works for minor game performances,” The Expanse’s Elias Toufexis explains, “but it still doesn’t work for real performance. Go watch the Boba Fett episode with Luke Skywalker in it – his whole vocal performance is AI and it’s terrible. If they need it for ‘grenade!’ and ‘get down!’ in Call of Duty-type games, it’s fine. It’s going to hurt a bunch of new voice actors, though, because that’s a window in for a lot of us.”

Horizon Zero Dawn actor Ashly Burch agrees: “I completely understand the desire for affordable VO for indie developers,” she says. “What I think a lot of people don’t know is that SAG-AFTRA (the American actors’ union) has a low-budget agreement to address this issue. It’s specifically designed so indie developers can get access to quality VO without breaking the bank.

“Artistically, you’re never going to get a truly dynamic and compelling performance from an AI. A few combat barks? Maybe. But if you’re looking for something human and nuanced and alive, AI isn’t going to cut it. Low-budget or smaller titles are where a lot of new VO folks get their start. If devs transition to AI, an entire entry point for young artists is being squeezed out.”

The people at Altered AI believe this will turn out differently. Agiomyrgiannakis says he would be concerned “if the volume of NPCs was fixed”, but he believes that the tech will just lead to bigger worlds with more characters, and game developer budgets will still be spent on actors, but they will be able “to record 10 times more”.

In Agiomyrgiannakis’s vision of the future, actors who are starting out won’t be squeezed out. Instead, they will work for him. “Who’s going to drive the voices?” he asks. “We need actors to drive the voices.”

Altered AI doesn’t scrape from the internet as the AI art apps do – it gets new performances plugged into it, which you can then synthesise and change in perpetuity. But since there hasn’t been an open dialogue between the unions, the actors, and the tech creators, who knows what these job offers will even look like?

In one possible future, actors are cutting their teeth working for AI companies, but since their work is anonymised and altered before being used, they’re not getting credits. In this world, only those who are already a name will land major roles, since we still need that raw humanity to drive stories. Agiomyrgiannakis is the first to admit that AI can’t innovate in the same way a real actor can, so the tech has its limitations and won’t ever completely take over from humans.

I ask whether his AI can capture emotion and hit the viewer in the heart in the same way a good actor can. Let’s say I’m an indie developer, I say, and I want to make something like The Last of Us. I have something similar to that opening scene where Joel is hugging his dying daughter. The audience feels it. Can your tool do that?

“I believe so,” Agiomyrgiannakis replies. “If you check out our videos, you can see, for example, a video called Lincoln.”

I’ve since watched the video and I don’t think actors have anything to worry about with this tech stealing major roles just yet. It shows an original performance intercut with the AI doing its best impression of Daniel Day-Lewis in the scene. You could honestly drag someone random off the street and they would do a better job.

Scrub through the footage to 1:15 – it can’t even say the word “blood”, and any moments of rage do not translate across at all. This is a showcase video, so it’s clearly supposed to be a highlight. It makes you wonder what the outtakes were like.

Another concern actors have is how this kind of tech is already influencing the industry – an industry that can spend hundreds of thousands of pounds paying streamers to play its games, but will cut costs in almost every other way imaginable. Actors are already seeing contracts with problematic clauses, under which they could sign away their movements and voice-over to be used forever without extra compensation. It’s a real issue in an industry that’s already way behind film on actor compensation – movie actors are typically paid residuals based on how well a film sells. There’s generally nothing of the sort in video games for developers or actors – the CEOs do alright, though.

“I can understand wanting to make things cheaper and easier for people when they may not have the budget, but actors have always worked with companies to find fair practices,” Marvel’s Spider-Man actor Yuri Lowenthal explains. “Underestimating the actor’s contribution can lead to exploitation, and could be avoided by starting a conversation with actors so we can make it work for everyone. As of now, I don’t think anyone from these AI companies has reached out to us as a whole, to see if we can agree on what might be fair use and fair compensation for the use of our voices, our performances.

“There is no morally sound financial shortcut here. I’ve, of late, started to catch very vague clauses in actors’ contracts that allow companies to use our performances for whatever they want in perpetuity, and maybe already have done so in order to develop this technology. In fact, I know an actor who does a lot of performance capture and voice work and she has seen her very specific movement show up in games she never even worked on, which means her data sets were either repurposed for other projects she never signed off on or, even worse, sold to other companies without her knowledge. This is a scary precedent that has already been set, and I want to start a conversation with AI companies about how we could protect actors, and again, the ecosystem of storytelling.”

The concerns aren’t just about missing payments for your performance being used in other materials, either. What if your voice, body, or face ends up getting attached to something you’re morally opposed to? Tech like Altered AI could potentially open the floodgates to this, and developers have already shown that they’re willing to dive right into the murky waters.

“Too many companies are asking actors to sign horrible contracts with zero input on the final product for an often-crappy one-time buyout,” Cissy Jones explains. “I definitely understand how tech like this can be intriguing for indie games, but if we have no guardrails as actors, our voices could end up being used for offensive materials or inappropriate casting.

“We’ve seen companies that slip in clauses that give them the rights to use recordings from a non-AI session that will be – or has already been – used to create a synthetic voice. We are finding all manner of hidden clauses, buried details, and snuck-in verbiage, enough to make your head spin. It’s incredibly disheartening because very few companies – if any – are asking for the actors’ input into what is fair.

“Everyone – union, non-union, and everything in between – needs to be aware of what is at stake and what their rights should be, and they need to be aware of what their contracts actually say. The broader implications of this technology are frightening. It’s a path to misuse and deepfakes. There need to be protections and guardrails for all of us to prevent abuses. I think there could be a right way for this to be done but we all need a seat at the table.”

When I speak to Benjamin Byron Davis – the actor who plays Dutch in Red Dead Redemption – about the tech, he tells me a story about when he used to have a Facebook account. He had a professional headshot done for his portfolio and he ran it through a filter to make it look like a painting, then used it as his Facebook profile photo. One of his friends commented on the photo, saying it made them feel sad.

“What made him sad about it was that there was no need for the painter,” Davis says. “And in not having a need for the painter, you don’t have the need for all the experience that is required to develop as a painter. And so now we have an image that looks very much like a beautiful painting, but it is done by an algorithm or a machine. I can’t pretend to know exactly how any of these filters work. And certainly, there is artistry in creating these filters. But yeah, there is something sorrowful in being able to create this outcome without requiring the development of the tools and training so essential to our human experience. And from there, you then have to ask the question of what that then does to the audience’s palate. Do we cease to value art itself, and what an artist can do, and what an artist does?”

Davis’ brothers are of a similar build to him and sound a bit like him, but they can’t do what he can do with his voice and his body when he’s in front of the camera or on a performance capture stage. Likewise, he can’t speak about law in the same way his lawyer brother, Joshua, can. He can’t play basketball as well as his brother Alex can, either. We build up these skills over a lifetime.

“Now, again, this technology is coming whether anybody likes it or not,” he says. “There is no standing against the tide on this, it is coming. But I do think it’s important to be clear-eyed about what we may be losing.”

Of course, Ninja Theory and the game development studios working with Altered AI aren’t the only ones attempting to cut corners with this new tech. Before I started typing out these words, I put my interview recordings into Otter AI and got it to transcribe them for me. The process is never perfect, but it saves me hours on an article that already took me a week to put together. And while AI art raises concerns for artists, some are using these tools to speed up their workflow, generating images for ideation and photo bashing – a method of creating digital art from other images, transforming something that exists until it’s unrecognisable and becomes its own piece of art.

Maybe one day this tech will be useful for actors in the same way, but they need to have a seat at the table, along with the unions, and developers need to talk to them to figure out what’s fair and right. Otherwise, we’re going to push the talent that drives game stories out of the industry altogether.

Sarah Elmaleh remembers Benedict Cumberbatch playing the dragon Smaug in The Hobbit movies and hearing his voice processing in real-time, and points out how this tech isn’t the same. “That’s an actor wearing a costume,” she says. “When you imitate an actor, that’s an invocation – generally there’s an instinctive dual perception on the part of the audience of both the artist present and the artist in reference. But when you purposefully and successfully elide that duality, when you wear actors as costumes, that feels more like… a skinsuit? An uncanny and dishonest simulacrum with none of the original’s expressiveness, none of the maturity, spontaneity, and particularity of their artistic choices moment-to-moment. To me, it’s Cronenbergian.”

I reached out to SAG-AFTRA to get a closing comment and a spokesperson came back with the following:

“SAG-AFTRA has contracts for video games and all other forms of voiceover that let producers and developers at nearly every budget level hire professional voice actors like those interviewed for this article. As technology continues to change the entertainment and media landscape, we will continue to create contracts that are fair and protective for performers and responsive to the needs of the companies that wish to employ them. We are also adding or negotiating language into our existing contracts that provides critical protection from misuse or unauthorised use of a member’s voice or image through technology.

“Protection of a performer’s digital self is a critical issue for SAG-AFTRA and our members. These new technologies offer exciting new opportunities but can also pose potential threats to performers’ livelihoods. It is crucial that performers control exploitation of their digital self, be properly compensated for its use, and be able to provide informed consent.

“It is critical that performers who work with these companies understand what they are agreeing to, including whether any ethics policy or safeguard could be changed in the future, and ensure they are protected. For example, Altered AI is based in London — US-based performers need to know how it will impact them in the event of a dispute.

“Among the key provisions we include in AI-related contracts are: safe storage of a performer’s recordings and data; usage limitations and the right to consent — or not — to uses of the performer’s digital double; transparency around how the digital double will be used; and appropriate compensation.

“Most importantly, a SAG-AFTRA contract puts the power and expertise of the union behind the performer, both in negotiating and enforcing contracts. We have lawyers and staff who focus on digital and AI technology. We know that change is coming. SAG-AFTRA is committed to keeping our members safe from unauthorised or improper use of their voice, image or performance, regardless of the technology employed. The best way for a performer to venture into this new world is armed with a union contract.”

King Canute might not have been able to push back the tide, but he never tried asking his advisers to negotiate with it.

Written by Kirk McKeand on behalf of GLHF.
