Using AI to generate your voice-over: good or bad idea?

We live in a time when artificial intelligence (AI) is at the heart of every discussion. And with good reason: it’s a revolution! Generating images in the blink of an eye, producing audio files (for example, original musical creations), suggesting stock optimisations for companies… AI can do so many things, almost instantly and often for free! So the question of AI versus human voice-overs is a very real one.

Artificial intelligence can of course also generate voice-overs: with this in mind, even the SNCF could abandon the emblematic voice of Simone Hérault in favour of artificial intelligence. So, is using AI to generate your voice-over a good or bad idea?

First of all, what is artificial intelligence (AI)?

According to the Larousse dictionary, artificial intelligence is a set of theories and techniques used to create machines capable of simulating human intelligence.

The field of voice-overs, and actors in general (see article on the future of extras), is directly affected by these technological developments.

Artificial intelligence, a revolution in the world of voice-overs

AI is progressing fast in many sectors, fuelled by the data that humans continuously feed it. Consumer tools such as ChatGPT (text generation) and Midjourney (image generation) have seen their use grow very quickly. Using AI has significant advantages, which we will look at together.

The economic advantage

The first argument that speaks to everyone is the economic advantage.

In the voice-over field, the relevant technology is text-to-speech (TTS): you enter your text on a website, which converts it into audio. The self-service text-to-speech tools available online are often free, which inevitably reduces the cost of producing the audio file.

What’s more, even paid solutions are often more cost-effective than hiring professional voice actors, particularly for long, tedious projects. Text-to-speech is often used in applications such as Google Translate voices, where the ‘human’ aspect is not particularly sought after but the number of words is large.


The other major argument is how quickly the customer receives the audio file. With text-to-speech, all you have to do is enter your text to get an instant audio result, whereas a voice-over actor will inevitably need some delivery time. For example, my average delivery time is 48 hours from receipt of the script. This deadline can, of course, be adapted to the client’s needs and reduced to 24 hours or even 2 hours for express requests (in this case, please call me directly on +33 6 20 76 26 43).

What’s more, AI can generate the same audio in several languages, which is a significant advantage when broadcasting internationally or on the web, for example on YouTube. A voice-over actor is generally fluent in only one or two languages. If you have international requirements, you will need several voice-over actors for the same project, usually one per language.

The limits of artificial intelligence

However, in the field of voice-overs, AI has not yet reached the level of understanding and expression that a human voice can offer.


The human voice has the unique ability to convey emotions authentically. Although AI can generate so-called “natural” voices, it struggles to capture and reproduce the subtleties of human emotion. An actor’s natural micro-expressions and vocal variations are unique to humans. Even if listeners do not consciously hear them, they feel them.

The same applies to actors in films and video games. In video game cutscenes, or in the film Anita (a film in which the main character is played by a 3D figure), facial micro-expressions are not reproduced. This gives the impression of a frozen, false face. It’s a bit like coming face to face with a wax statue at the Musée Grévin and thinking “There’s something missing”. The same applies to the voice. It takes the viewer out of the story, out of the emotion.

This emotion is all the more important in advertising. Just listen to the sensual voice-overs in perfume ads such as Dior or coffee ads such as Or café. The text of the voice-over gives no information about the brand or the product, and yet it creates an emotion in the consumer. In fact, there is ONLY emotion. Advertising is there to seduce the consumer, and seduction is a very human thing. I have already discussed the importance of voice-overs in advertising in a previous article.

Dior ad

Furthermore, artificial intelligences are all developed in much the same way. Admittedly, they take their cues from real voice-over actors by analysing their voices and trying to reproduce them, but they tend to sound alike and to have more or less the same intonations.

Humans bring a touch of authenticity that machines cannot yet match.

Customer relations

The first step when a customer hires a voice-over actor is to explain what they need. The actor’s experience and professionalism enable them to guide a customer who is unsure, or to ask relevant questions in order to meet the customer’s needs as closely as possible. When you call on the services of an actor, you get not only a voice, but also advice, experience, an attentive ear and an overall understanding of the project.

I don’t need to tell you how many times I’ve received badly translated scripts, which I had to proofread thoroughly to get a professional result.

And what about pronunciation? Some brand names are pronounced differently depending on the language. For example, ‘Nike’ is not pronounced the same way in French (Naïke) as in Spanish (Niké). Some names are also difficult for text-to-speech artificial intelligence to handle because they involve a play on words. With most self-service tools, the customer cannot easily teach the artificial intelligence to pronounce a particular word in a particular way. The actor, on the other hand, can adapt their pronunciation to the client’s needs.

To use an AI, it is essential for the client to know exactly what result they want, to use terms that the AI can understand, and then to know how to guide it when the tone, the rhythm, the target audience, etc. need adjusting.

The relationship is therefore completely reversed: the artificial intelligence cannot take the initiative and simply responds to the words written by the customer, without any interpretation.


Ultimately, despite the rapid progress of artificial intelligence and the questions it raises, it remains a machine at the service of humans, not a replacement for them. While it can be a valuable tool for certain very specific uses (such as Google Translate voices), AI cannot yet fully reproduce the essence of human emotions or meet all customer needs.

And even if for certain projects the use of artificial intelligence may be preferred, learning from a computer-generated voice that your train no. 4876 from Montpellier to Paris will be 3 hours late will always be less pleasant than learning it from the soft, calm voice of Simone Hérault.

The voice-over profession remains an art in which the human voice retains its irreplaceable place.

Not ready to switch to artificial intelligence for your voice-over? Then contact me!

Justine Petitjean, French voice-over actress

Don’t wait any longer, book the voice-over for your project

Write to me via the link below and I will respond as soon as I receive your message.

Copyright © 2024 Created by La Main Web