GazeSpeak & Microsoft’s ongoing efforts to support people with Motor Neuron Disease (ALS)

Last Friday (February 17th) New Scientist published an article about a new app in development at Microsoft called GazeSpeak. Due to be released over the coming months on iOS, GazeSpeak aims to facilitate communication between a person with MND (known as ALS in the US; I will use both terms interchangeably) and another individual, perhaps their partner, carer or friend. Developed by Microsoft intern Xiaoyi Zhang, GazeSpeak differs from traditional approaches in a number of ways. Before getting into the details, however, it’s worth looking at the background. GazeSpeak didn’t come from nowhere; it’s actually one of the products of some heavyweight research into Augmentative and Alternative Communication (AAC) that has been taking place at Microsoft over the last few years. Since 2013, inspired by football legend and ALS sufferer Steve Gleason (read more here), Microsoft researchers and developers have brought the weight of their considerable collective intellect to bear on the subject of increasing the ease and efficiency of communication for people with MND.

Last year Microsoft Research published a paper called “AACrobat: Using Mobile Devices to Lower Communication Barriers and Provide Autonomy with Gaze-Based AAC” (abstract and pdf download at previous link) which proposed a companion app to allow an AAC user’s communication partner to assist (in a non-intrusive way) in the communication process. Take a look at the video below for a more detailed explanation.

This is an entirely new approach to increasing the efficiency of AAC and one that, I suggest, could only have come from a large mainstream tech organisation with over thirty years’ experience of facilitating communication and collaboration.

Another Microsoft research paper published last year (with some of the same authors as the previous paper), called “Exploring the Design Space of AAC Awareness Displays”, looks at the importance of a communication partner’s “awareness of the subtle, social, and contextual cues that are necessary for people to naturally communicate in person”. Their research focused on creating a display that would allow the person with ALS to express things like humor, frustration and affection, emotions that are difficult to express with text alone. Yes, they proposed the use of Emoji, a proven and effective way of overcoming a similar difficulty in remote or non-face-to-face interactions; however, they went much further and also looked at solutions like Avatars, Skins and even coloured LED arrays. This, like the AACrobat paper, is an academic paper and as such not an easy read, but the ideas and solutions being proposed by these researchers are practical and will hopefully filter through to end users of future AAC solutions.

That brings us back to GazeSpeak, the first fruits of the Microsoft/Steve Gleason partnership to reach the general public. Like the AACrobat solution outlined above, GazeSpeak gives the communication partner a tool rather than focusing on tech for the person with MND. As the image below illustrates, the communication partner would have GazeSpeak installed on their phone and, with the app running, would hold their device up to the person with MND as if they were photographing them. They suggest that a sticker with four grids of letters be placed on the back of the smartphone, facing the speaker. The app then tracks the person’s eyes as they look up, down, left or right; each direction means that the letter they are selecting is contained in the grid in that direction (see photo below).

Image: a man looking right while another person holds up a smartphone with GazeSpeak installed

Similar to how the old T9 predictive text worked, GazeSpeak selects the appropriate letter from each group and predicts the word based on the most common English words. So the app uses AI both to track the eyes (in the form of machine vision) and to make the word prediction. The New Scientist article mentions that the user would be able to add their own commonly used words and people/place names, which one assumes would be prioritized within the prediction list. In the future perhaps some capacity for learning could be added to further increase efficiency. After using this system for a while the speaker may not even need to see the sticker with letters; they could write words from muscle memory. At this stage a simple QR code leading to the app download would allow them to communicate with complete strangers using just their eyes and no personal technology.
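
To make the T9 comparison concrete, here is a minimal sketch in Python of how this kind of prediction could work. It is not GazeSpeak’s actual code: the four letter groups, the word counts and the function names are all assumptions made up for illustration, since the article gives neither the real sticker layout nor the word list. Each letter maps to the gaze direction of its group, every known word is indexed by its sequence of directions, and candidates come back most common first, with any user-added words moved to the top.

```python
from collections import defaultdict

# Hypothetical letter groups: the real sticker layout is not given in the article.
GROUPS = {
    "up": "abcdef",
    "left": "ghijkl",
    "right": "mnopqr",
    "down": "stuvwxyz",
}

# Reverse lookup: letter -> direction of the group that contains it.
LETTER_TO_DIRECTION = {
    letter: direction for direction, letters in GROUPS.items() for letter in letters
}


def encode(word):
    """Return the sequence of gaze directions that spells `word` (letters a-z only)."""
    return tuple(LETTER_TO_DIRECTION[ch] for ch in word.lower())


def build_index(word_counts):
    """Map each direction sequence to its matching words, most frequent first."""
    index = defaultdict(list)
    for word, count in word_counts.items():
        index[encode(word)].append((count, word))
    return {
        seq: [w for _, w in sorted(entries, key=lambda e: (-e[0], e[1]))]
        for seq, entries in index.items()
    }


def predict(directions, index, custom_words=()):
    """Rank candidate words for a gaze sequence, user-added words first."""
    seq = tuple(directions)
    custom = [w for w in custom_words if encode(w) == seq]
    common = [w for w in index.get(seq, []) if w not in custom]
    return custom + common


# Illustrative counts only, not real word-frequency data.
index = build_index({"the": 1000, "she": 300, "tie": 50, "hello": 200})

# "the", "she" and "tie" all share the sequence down, left, up.
print(predict(["down", "left", "up"], index))
# A user-added name with the same sequence is listed first.
print(predict(["down", "left", "up"], index, custom_words=["Tia"]))
```

With only four groups, many words share the same sequence, so the ordering of the candidate list does most of the work. That is why prioritizing a user’s own words and names, as the article describes, could make a noticeable difference.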
