Google Assistant can talk and sing like John Legend in the U.S., and it’s conversant in over 30 languages in 80 countries (up from 8 languages and 14 countries in 2017). But in the years since its international launch, Google’s AI interlocutor has offered only a single voice outside of the U.S. Fortunately, that’s changing.
Google today announced that Google Assistant users who’ve selected English in the U.K., English in India, French, German, Japanese, Dutch, Norwegian, Korean, or Italian will gain a second voice type with a unique cadence. The new voices join the 11 English voices already available stateside, six of which were previewed at Google’s I/O 2018 developer conference last year.
Google Assistant product manager Brant Ward said that each voice was synthesized by WaveNet, a machine learning system pioneered by Alphabet’s DeepMind. For the uninitiated, WaveNet mimics things like stress and intonation (referred to in linguistics as prosody) by identifying tonal patterns in speech. It produces much more convincing speech snippets than previous AI models, and it’s more efficient, too. Running on Google’s tensor processing units (TPUs), custom chips packed with circuits optimized for AI model training, WaveNet creates a one-second voice sample in just 50 milliseconds, or roughly 20 times faster than real time.
Amazon followed in DeepMind’s footsteps earlier this year with a neural text-to-speech model that enables Alexa to narrate snippets from Wikipedia more naturally, with a contextually sensitive voice. Nearly a dozen voices generated by the same model rolled out to Amazon Polly in July, following the addition of 38 new WaveNet-generated voices to Google’s Cloud Text-to-Speech service.
“[The new Google Assistant voices] … sound natural, with great pitch and pacing,” said Ward. “We’ve learned that people enjoy choosing between voices to find the one that sounds right to them.”
Eager to give them a go? Head to the Settings menu in the Google Assistant app for Android or iOS, where you’ll see a list of choices displayed by color as opposed to gender. (Google says the hues are intended to avoid influencing people’s voice selection with labels.) Alternatively, if you live in one of the countries receiving a new voice, either a “red” voice or “orange” voice will be randomly assigned when you first set up Google Assistant.
“A lot of people are surprised to learn that they don’t always stick with the voice they’ve been using, so give it a shot. You might just find one that sounds even better than the one you’ve been using,” said Ward.
Google Assistant’s new voices follow the rollout of Continued Conversation, a feature akin to Alexa’s Follow-Up Mode that listens for additional queries or follow-up questions after an initial exchange. In February, Google expanded multilingual support — which enables Google Assistant to recognize multiple languages in multiturn conversations — to Korean, Hindi, Swedish, Norwegian, Danish, and Dutch. In other news, Google introduced Interpreter Mode for translations in dozens of languages; announced a reduction in speech recognition errors of 29 percent; and detailed Duplex on the web, an evolution of Google’s Duplex chat agent that can handle things like rental car bookings and movie ticket purchases.