Google's DeepMind Teaches Computers How to Speak Human

...

Apple's Siri personal assistant is getting a lot smarter in the upcoming iOS 10, but odds are she'll still sound like a computer. Meanwhile, a subsidiary of Google (her creator's rival) is working on an entirely new model for teaching computers to convert text to speech.

It's called WaveNet, and Google says it can mimic any human voice while sounding more natural than text-to-speech algorithms available today.

WaveNet is based on research from DeepMind, which this week offered an in-depth look at its efforts to synthesize audio signals for more natural-sounding artificial voices. It all starts with convolutional neural networks, the same technology that powers everything from self-driving cars to disease detection.

Neural networks also now power some current text-to-speech products, including Siri, which two years ago was rebuilt to take advantage of this form of machine learning. But Siri and her colleagues, like Google Voice Search or Amazon's Alexa, still use a database of short speech fragments that are strung together to form complete words and sentences. The result is a halting, emotionless voice, even if it is understandable.

What if, instead of stitching together speech fragments, a computer could generate raw audio waveforms directly? That would not only allow for more natural-sounding speech; it would also let the computer mimic virtually any sound, including music. DeepMind engineers set to work.

At first, they faced an uphill battle because raw audio is so dense: a computer typically has to process 16,000 or more samples per second. But the engineers were eventually able to build a neural network trained on real waveforms from human speakers. The network models each audio sample as a probability distribution conditioned on the samples that came before it, then generates new audio by drawing from that distribution. In essence, it teaches the computer how to speak like a human.

"Building up samples one step at a time like this is computationally expensive," according to DeepMind, "but we have found it essential for generating complex, realistic-sounding audio."
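
To make "one step at a time" concrete, here is a minimal Python sketch of sample-by-sample generation. It is not DeepMind's model: `toy_next_sample_distribution` is a hypothetical stand-in for a trained network, and the 256 amplitude levels are an assumption made for illustration. Only the 16,000 samples-per-second figure comes from the article.

```python
import math
import random

SAMPLE_RATE = 16_000  # samples per second, the figure quoted in the article
LEVELS = 256          # assumed number of discrete amplitude levels (illustrative)


def toy_next_sample_distribution(history):
    """Hypothetical stand-in for a trained network.

    Returns one probability per amplitude level, given the samples so far.
    It simply favors levels near a straight-line extrapolation of the
    last two samples.
    """
    if len(history) >= 2:
        guess = 2 * history[-1] - history[-2]
    elif history:
        guess = history[-1]
    else:
        guess = LEVELS // 2
    guess = min(max(guess, 0), LEVELS - 1)
    weights = [math.exp(-((level - guess) ** 2) / 8.0) for level in range(LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_samples, seed=0):
    """Build a waveform one sample at a time, each drawn from a distribution
    that depends on every sample generated before it."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        probs = toy_next_sample_distribution(samples)
        next_sample = rng.choices(range(LEVELS), weights=probs, k=1)[0]
        samples.append(next_sample)
    return samples


def model_calls_needed(seconds, sample_rate=SAMPLE_RATE):
    """Each output sample costs one pass through the model."""
    return int(seconds * sample_rate)


if __name__ == "__main__":
    print(generate(10))
    print(model_calls_needed(1.0), "model calls for one second of audio")
```

Because each new sample depends on the ones before it, the loop cannot be skipped or run out of order. At 16,000 samples per second, a single second of audio takes 16,000 sequential passes through the model, which is where the expense comes from.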

The result is remarkable. DeepMind provided samples of its speech capabilities alongside those typically used today, and the difference in inflection, tone, and emotion is immediately apparent. Have a listen for yourself.

It's only natural that computers' speech synthesis will become more, well, natural: Google and its competitors have invested significant resources in developing personal assistants. In order for them to catch on, humans need to think of them less as a gimmick and more as articulate, pleasant robots.
