
HEAR MY LOVE

F5, Shanghai / WEBANK / 2018


Overview


Campaign Description

Left-behind children grow up without parental support, and their needs for love and emotional connection go unmet.

Over time, they tend to become sensitive and introverted, developing psychological problems such as feelings of inferiority and self-blame.

Simulating a parent's voice with artificial intelligence enables left-behind children to feel their parents' love.

So WeBank, as China's most innovative bank, decided to tackle the problem with innovative thinking.

WeBank created an AI speaker that mimics the parent's voice to tell bedtime stories to left-behind children.

Execution

First, we recorded speech fragments from the parent.

Then we fed them into our deep neural networks, creating a Text-to-Speech system.

The system can tell any story in the voice of its originator, provided the story text is available.

Lastly, we built a speaker and installed the system in it.
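The three execution steps can be sketched as a minimal pipeline. All class and function names below are illustrative stand-ins for the production system, not its actual components:

```python
# Hedged sketch of the execution steps: record fragments, train a TTS
# system on them, install it in a speaker. Everything here is a stub.

def record_speech_fragments(parent_id):
    """Step 1: collect short recordings from the parent (stub filenames here)."""
    return [f"{parent_id}_fragment_{i}.wav" for i in range(3)]


class TextToSpeech:
    """Step 2: stand-in for the deep-neural-network TTS trained on the fragments."""

    def __init__(self, fragments):
        self.voice = fragments  # in reality: model weights fit to these samples

    def synthesize(self, story_text):
        # Given any story text, render it in the cloned voice (stubbed output).
        return f"audio<{story_text}>"


class Speaker:
    """Step 3: the physical speaker with the synthesis system installed."""

    def __init__(self, tts):
        self.tts = tts

    def tell_story(self, story_text):
        return self.tts.synthesize(story_text)


fragments = record_speech_fragments("father")
speaker = Speaker(TextToSpeech(fragments))
audio = speaker.tell_story("Once upon a time, in a small village...")
```

The key property is the one the campaign relies on: once the voice model is trained, any story text can be rendered, so new bedtime stories need no new recordings from the parent.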

Outcome

Fully supported by China's tech giant Tencent, WeBank aims to help 3,000 more left-behind children in 2018.

Relevancy

Firstly, Hear My Love is an innovative solution to a serious social problem in China, helping over 60 million left-behind children toward better mental health.

Secondly, we used a new end-to-end approach to build a speech synthesis system that can mimic any human voice and generates speech that sounds more natural than the best existing Text-to-Speech systems.

Lastly, Hear My Love is fully supported by the WeChat iHearing team for speech synthesis, Tencent's InnovationLab for speaker development, and WeBank for financial resources, all of them part of, or related to, China's tech giant Tencent.

Solution

16th November 2017: Hear My Love project kicked off.

10th January 2018: Father's speech fragments collected.

26th January 2018: Father's speech synthesis system ready.

14th February 2018: A.I. speaker prototype with built-in speech synthesis system ready.

6th April 2018: First A.I. speaker delivered to left-behind child Li Jiali, who lives in Lijiang County, Yunnan Province, 2,900 km from her father's workplace in Shanghai.

Synopsis

We created a neural voice cloning system that takes a few audio samples as input.

We used two approaches to build the neural voice cloning system: speaker adaptation and speaker encoding.

Speaker adaptation fine-tunes a multi-speaker generative model with a few cloning samples. Speaker encoding trains a separate model to directly infer a new speaker embedding from the cloning audio; that embedding is then used with the multi-speaker generative model.

In terms of the naturalness of the speech and its similarity to the original speaker, both approaches achieve good performance, even with very few cloning samples.

While speaker adaptation can achieve better naturalness and similarity, speaker encoding requires significantly less cloning time and memory, making it favorable for low-resource deployment.
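The trade-off between the two approaches can be illustrated with a toy model. The embedding dimensionality, learning rate, and "feature extractor" below are illustrative assumptions, not the production system; the point is only that adaptation runs an optimisation loop while encoding does a single pass:

```python
# Toy contrast of speaker adaptation (iterative fine-tuning) versus
# speaker encoding (one-shot inference) over a few cloning samples.
import random

random.seed(0)
DIM = 8  # hypothetical speaker-embedding dimensionality


def extract_features(audio_sample):
    """Stand-in for an acoustic feature extractor over one cloning sample."""
    return list(audio_sample)


def adapt_embedding(cloning_samples, steps=200, lr=0.1):
    """Speaker adaptation: fine-tune an embedding by gradient descent."""
    feats = [extract_features(s) for s in cloning_samples]
    emb = [0.0] * DIM  # start from a generic speaker point
    for _ in range(steps):
        # Gradient of the mean squared error against the cloning samples.
        grad = [0.0] * DIM
        for f in feats:
            for i in range(DIM):
                grad[i] += 2.0 * (emb[i] - f[i]) / len(feats)
        for i in range(DIM):
            emb[i] -= lr * grad[i]
    return emb


def encode_embedding(cloning_samples):
    """Speaker encoding: infer the embedding in one pass, no optimisation loop."""
    feats = [extract_features(s) for s in cloning_samples]
    return [sum(f[i] for f in feats) / len(feats) for i in range(DIM)]


samples = [[random.gauss(0.5, 0.1) for _ in range(DIM)] for _ in range(3)]
adapted = adapt_embedding(samples)   # many gradient steps: slower, more memory
encoded = encode_embedding(samples)  # single pass: cheap on low-resource devices
```

In this toy setting both routes converge to essentially the same speaker point, which mirrors the synopsis: comparable quality, but very different cloning cost.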
