How to easily turn all your articles into audio articles | What’s New in Publishing

Text-to-speech technology has become more accessible than you might think, and you should include an audio version of your articles. There are no more excuses.

At the end of 2016, the Danish online magazine Zetland decided to move into audio based on reader requests. In the summer of 2017, Zetland started publishing all of its articles in audio form, and things started to change in earnest.

Within two months, 40% of consumption was audio; in less than six months it was 50%; and within a year, 70% of all consumption was audio. The move has improved member retention and satisfaction. Journalists read their own stories.

Zetland exceeded 28,000 members (Danish population: 5.8 million) in 2021 and has been financially viable since 2019.

After The New York Times bought Audm, a startup that turns long-form journalism into audio content, some of its articles started appearing with an audio version read by its author.

Of course, having authors read their own articles is a very good brand-building exercise, as we have seen with both Zetland and The New York Times. However, some publishers are increasingly turning to text-to-speech technology and using artificial and neural voices to read their articles aloud.

Use text-to-speech apps to create audio versions of your stories

During the pandemic, The Wall Street Journal reached its digital subscriber record and also surpassed its overall traffic record. With this in mind, it ran a number of experiments aimed at getting new, less engaged members (those who visit WSJ fewer than 10 days per month) to return more often.

One of the most successful experiments was the “Listen to this Story” feature, an auto-generated text-to-speech audio version of every story on the website. The newspaper said it turned out to be more addictive than its popular crosswords. Most importantly, it was well received by younger and older readers alike.

WSJ has built its own text-to-speech (TTS) player, which is connected to one of the many cloud-based machine learning solutions offered by major tech companies. You can use Google’s TTS API, opt for the Amazon Polly API (as The Washington Post does), or choose another major cloud provider of such services.
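If you do connect to a cloud TTS API yourself, one practical step is preparing the article text before you send it. The Python sketch below is only an illustration, not the code of WSJ or of any provider: it splits an article into sentence-aligned chunks and wraps each one in a standard SSML `<speak>` element. The 3,000-character limit and the function names `split_into_chunks` and `to_ssml` are assumptions made for this example, so check your provider’s documentation for its real request limits.

```python
from __future__ import annotations

import re
from xml.sax.saxutils import escape

# Assumed per-request character limit; real limits differ by provider.
MAX_CHARS = 3000


def split_into_chunks(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Group whole sentences into chunks of roughly max_chars or fewer.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # The sentence fits, or it is the first sentence of a new chunk.
            current = candidate
        else:
            # Close the current chunk and start a new one with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def to_ssml(chunk: str) -> str:
    """Wrap plain text in an SSML <speak> element, escaping &, < and >."""
    return f"<speak>{escape(chunk)}</speak>"
```

Each chunk would then go to the TTS service in its own request, and the returned audio segments would be joined in order. Splitting on sentence boundaries avoids a break in the middle of a sentence where two segments meet.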

Of course, if you are not WSJ or The Washington Post and have limited resources, this may seem like a lofty goal. Well, not anymore.

As is almost always the case with technology, it only takes a while for intermediary services to emerge and offer ready-to-use solutions at a much more reasonable price than hiring a whole team of developers to build the functionality from scratch.

After doing a little research, I’ve compiled five services below that you can start using right away, along with examples of websites that use them.

For now, they are all in English, but don’t let that bother you. Here is a list of languages offered by the Google API (which most of the listed services use): Afrikaans, Arabic, Bengali, Bulgarian, Catalan, Chinese, Czech, Danish, Dutch, Filipino, Finnish, French, German, Greek, Gujarati, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Malay, Malayalam, Mandarin, Norwegian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Slovak, Spanish, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Vietnamese.

1) BeyondWords

BeyondWords is probably my favorite of these services, honestly. It is used, for example, by Journalism.co.uk. It uses AI voices and the latest text-to-speech voices from Amazon, Microsoft, Google and Yandex – over 700 voices in 64 languages.

The initial setup is quite simple, and BeyondWords offers a free tier. You can even turn your audio articles into an RSS feed and a podcast. Here is a list of supported languages.

BeyondWords also offers voice cloning technology to create custom AI voices. You can use your own voice or create a synthetic copy of your voice and use it. The service offers WordPress integration.

2) Trinity Audio

Trinity Audio offers a number of services. Trinity Player can convert all your content to audio with just a few clicks. Here is a list of supported languages. It also offers a free tier to start with.

Trinity Audio is used to create audio articles by Variety or McClatchy. The service also offers WordPress integration.

3) Play.ht

Play.ht also promises to generate realistic text-to-speech sound using its online AI voice generator and the best text-to-speech voices from Google, Amazon, IBM, and Microsoft. Here is a list of supported languages. It doesn’t offer a free tier, but has a nice Medium integration and WordPress integration.

Like BeyondWords, Play.ht can turn your audio articles into a podcast feed.
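Under the hood, a podcast feed is an RSS 2.0 file in which each item carries an `<enclosure>` element pointing at an audio file. The Python sketch below shows that idea using only the standard library. It is not how BeyondWords or Play.ht generate their feeds; the function name `build_podcast_feed`, the dictionary keys, and the example.com URLs are hypothetical, and real podcast directories usually expect extra tags that are left out here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_podcast_feed(title, link, description, episodes):
    """Return a minimal RSS 2.0 feed (as a string) with one item per audio article.

    Each episode is a dict with the hypothetical keys:
    title, article_url, audio_url, audio_bytes, published (an aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "link").text = ep["article_url"]
        ET.SubElement(item, "guid").text = ep["article_url"]
        # RSS expects RFC 822 style dates; format_datetime produces them.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure is what podcast apps download and play.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["audio_url"],
            length=str(ep["audio_bytes"]),
            type="audio/mpeg",
        )

    return ET.tostring(rss, encoding="unicode")


feed = build_podcast_feed(
    "Example Magazine Audio",
    "https://example.com",
    "Audio versions of our articles",
    [
        {
            "title": "Story",
            "article_url": "https://example.com/story",
            "audio_url": "https://example.com/audio/story.mp3",
            "audio_bytes": 1234567,
            "published": datetime(2022, 1, 3, 9, 0, tzinfo=timezone.utc),
        }
    ],
)
```

Publishing the resulting file at a stable URL is what lets listeners subscribe to your articles in a podcast app, which is the convenience these services provide out of the box.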

4) Speechify

Speechify promises easy integration with your website with just 5 lines of code. It is used, for example, by Medium to automatically create an audio version of every post on the website. Here’s one of my old Medium blog posts: I never added an audio version, but now it’s there.

Speechify has voices that can read text in over 20 different languages.

5) Remixd

Remixd is used by the US-based online tech publication The Verge (example) to produce audio versions of its articles. Unfortunately, the website does not provide much more information.

Start small and grow from there

I really think there are no more excuses for content websites not to provide audio versions of articles.

Of course, if you want high quality, that is only possible in the languages for which Google, Amazon, IBM and Microsoft offer a neural voice, which sounds more natural than the typical Google Translate voice most of us are used to hearing.

Here you can hear the difference between a standard voice and a neural voice, which synthesizes speech with a more human-like emphasis and inflection on syllables, phonemes, and words.

I did a test with my family and played them a clip of a human reading a text and a clip of a neural AI voice reading a text. They couldn’t tell that one of them wasn’t human. Yes, scary.

But on the other hand, it gives publishers a good option to turn their text-only website into a much richer experience, one with a proven effect on longer visits and more frequent return visits.

Of course, using professionals or your own authors to read the articles is still the best experience in the sense that you can hear that it is made by a human, especially with longer texts.

However, using a service like those mentioned above or Veritone Audio can expand what you are able to do.

Sounds Profitable, the weekly ad tech newsletter from Podnews, was able to synthesize the voice of its host, Bryan Barletta (that is, create a clone of it), and then use it to speak a language he doesn’t speak.

Barletta used Veritone Voice to create a voice model, his voice clone, capable of speaking Spanish in his voice. Barletta wrote about the whole process in detail in an earlier edition of his newsletter. The result is that with voice cloning technology, he is able to reach new audiences in their native language, and his audio delivery of the text doesn’t sound off-putting.

I think it’s a really smart way to use technology – to extend and build on something that a human has created.

David Tvrdon

This article originally appeared in The Fix and is republished with permission.
