MRM//McCANN, Frankfurt / GERMAN YOUTH ASSOCIATION OF PEOPLE WITH HEARING LOSS / 2019
Why is this work relevant for Innovation?
SIGNS is the world’s first smart voice assistant solution for people with hearing loss. It’s an innovative smart tool that recognizes and translates sign language in real time and then communicates directly with a selected voice assistant service (e.g. Amazon Alexa, Google Assistant or Microsoft Cortana). SIGNS is reinventing voice, one gesture at a time. This smart tool is the interface for all non-verbal communication with voice assistants, today and in the future.
Background
SIGNS is based on a machine learning framework (Google TensorFlow) that is trained to identify body gestures with the help of an integrated camera. These gestures are converted into a data format that the selected voice assistant service understands. For now, SIGNS brings hearing-impaired people and voice assistants together by offering some of the most used commands on Amazon Alexa. For example, it can put milk on the shopping list, change the color of smart home lights, and show the weather. At present it understands a limited set of signs in German Sign Language. However, it can learn a new set of signs in only minutes and can be adapted to any sign language.
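The conversion step described above, from recognized signs to a text command a voice assistant can act on, can be pictured with a minimal sketch. Everything in it is an assumption for illustration: the sign labels, the `COMMAND_TEMPLATES` table and the `signs_to_command` function are hypothetical names, not part of SIGNS, and the real system may represent commands quite differently.

```python
# Hypothetical lookup table: a sequence of recognized sign labels maps to
# the plain-text command that would be passed on to the voice assistant.
# The labels and phrases are illustrative assumptions, based only on the
# example commands named in the text (shopping list, lights, weather).
COMMAND_TEMPLATES = {
    ("MILK", "SHOPPING-LIST"): "put milk on the shopping list",
    ("LIGHT", "COLOR", "BLUE"): "change the light color to blue",
    ("WEATHER",): "show the weather",
}


def signs_to_command(signs):
    """Return the text command for a sequence of sign labels, or None."""
    key = tuple(label.upper() for label in signs)
    return COMMAND_TEMPLATES.get(key)


if __name__ == "__main__":
    print(signs_to_command(["milk", "shopping-list"]))
    print(signs_to_command(["unknown-sign"]))
```

A fixed table like this only mirrors the "limited set of signs" stage; adding a new command would mean adding a new entry once the corresponding signs have been learned.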
Describe the idea
There are over 2 billion voice-enabled devices across the globe. Voice assistants are changing the way we shop, search, communicate or even live. At least for most people. But what about those without a voice? What about those who cannot hear? According to the World Health Organization, around 466 million people worldwide have disabling hearing loss. Project SIGNS was developed to create awareness for inclusion in the digital age as well as to facilitate access to new technologies. Many people with hearing loss use their hands to speak. This is their natural language. Their hands are their voice. However, voice assistants use natural language processing to decipher and react only to audible commands. No sound means no reaction. The SIGNS prototype bridges the gap between deaf people and voice assistants by recognizing gestures and communicating directly with existing voice assistant services (e.g. Amazon Alexa, Google Assistant or Microsoft Cortana).
What were the key dates in the development process?
06/18 – 07/18 Ideation / Conceptualization
07/18 – 08/18 Development of an experience prototype based on infra-red camera and hand gesture recognition
08/18 – 09/18 Experience Testing
10/18 – 12/18 Conceptualization for the next-gen prototype with full body gesture recognition based on RGB camera
01/19 Interviews / user acceptance testing with deaf participants
01/19 – 04/19 Design & Development of the next-gen prototype
04/19 Premiere of SIGNS at the Conversational Design Event Frankfurt
04/19 – 05/19 Experience Testing with a focus group of deaf participants
Describe the innovation/technology
SIGNS uses an integrated camera to recognize sign language in real time and communicates directly with a voice assistant. The system is based on the machine learning framework Google TensorFlow. The output of a pre-trained MobileNet model is used to train several k-nearest-neighbor (KNN) classifiers on gestures. The recognizer calculates the likelihood of each gesture recorded by the webcam and converts the recognized gestures into text. The resulting sentences are translated into conventional grammar and sent to a cloud-based service that generates speech from them. In other words, the gestures are converted into a data format (text to speech) that the selected voice assistant understands. In the case shown, this is Amazon's Alexa Voice Service (AVS). AVS responds with metadata and audio data, which in turn is converted to text by a cloud service (speech to text). The result is displayed. SIGNS is browser-based and works on any operating system, provided the device has an integrated camera and can be connected to a voice assistant.
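The classification step can be illustrated with a minimal sketch of a k-nearest-neighbor classifier over feature vectors. This is not the SIGNS implementation: the short vectors below are hypothetical stand-ins for MobileNet outputs, the class name `KNNGestureClassifier` is invented for the example, and the choice of cosine similarity and simple vote counting as the "likelihood" is an assumption.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class KNNGestureClassifier:
    """Hypothetical KNN classifier over embedding vectors (illustration only)."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []  # list of (embedding, label) pairs

    def add_example(self, embedding, label):
        """Store one labelled example; KNN 'training' is just memorizing examples."""
        self.examples.append((list(embedding), label))

    def predict(self, embedding):
        """Return (best_label, {label: share of the k nearest neighbors})."""
        if not self.examples:
            raise ValueError("no training examples added")
        ranked = sorted(
            self.examples,
            key=lambda ex: cosine_similarity(embedding, ex[0]),
            reverse=True,
        )
        nearest = ranked[: self.k]
        votes = Counter(label for _, label in nearest)
        confidences = {label: n / len(nearest) for label, n in votes.items()}
        best = max(confidences, key=confidences.get)
        return best, confidences


if __name__ == "__main__":
    clf = KNNGestureClassifier(k=3)
    # Toy 3-dimensional vectors standing in for real image embeddings.
    for vec in ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.1]):
        clf.add_example(vec, "milk")
    for vec in ([0.0, 1.0, 0.0], [0.1, 0.9, 0.0], [0.0, 0.8, 0.2]):
        clf.add_example(vec, "weather")
    print(clf.predict([0.95, 0.05, 0.0]))
```

Because a KNN classifier only stores labelled examples, adding a new sign amounts to recording a few more embeddings, which is consistent with the claim that new signs can be learned quickly.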
Describe the expectations/outcome
The goal is to make SIGNS available on all assistants and to all hearing-impaired people. In the first release, planned for Q4, SIGNS will launch on Windows and macOS with Amazon Alexa connectivity and a limited set of gestures. The connector to Google Assistant is planned for Q2/20, and the connector to Microsoft Cortana for Q3/20. By Q4/20 there will be a crowd-based dictionary through which the community can contribute to the vocabulary. According to Gartner, 30% of all digital interactions will be non-screen based by 2020. Just like voice, gestures are an intuitive way of communicating, which makes them extremely relevant for the industry. Not just for the hearing impaired, but for everyone. People find it awkward to speak to the invisible in public. That’s why we believe that invisible conversational interactions with the digital world are not limited to voice itself.