I find the whole topic of assistive tech fascinating, as you may have gathered by now. I know a lot more about assistive tech for the visually impaired than about the other fields, and I’m trying to push myself to expand my horizons. At the CSUN Assistive Tech conference, I saw a talk that was way outside my knowledge, entitled “Interaction with Home Assistants for People with Dysarthria,” presented by Aisha Jaddoh, a PhD candidate at Cardiff University’s School of Computer Science and Informatics.
According to the Mayo Clinic:
Dysarthria occurs when the muscles you use for speech are weak or you have difficulty controlling them. Dysarthria often causes slurred or slow speech that can be difficult to understand.
Common causes of dysarthria include nervous system disorders and conditions that cause facial paralysis or tongue or throat muscle weakness. Certain medications also can cause dysarthria.
Now that we know what dysarthria is, let’s talk about the problem Ms. Jaddoh is hoping to solve.
We’ve all experienced the difficulties with so-called “smart” assistants when they don’t understand what we’re trying to say to them. I learned from listening to Ms. Jaddoh’s talk that it is ever so much harder for people who cannot produce clear speech. That lack of clarity can come from speech that is too slow, or from needing to breathe more often mid-sentence and having trouble sustaining long phrases.
Ms. Jaddoh’s research is on whether an intermediary device could basically act as a translator between the human and the home assistant. Her proposal is to use what she called “non-verbal voice cues”: any non-word sound a person can produce.
Her theory is that if a person with dysarthria can make some sounds, such as a hum, a whistle, or a vowel sound, a dictionary could be built that translates those sounds into questions or commands for the voice assistant. She suggested the example of a person saying “ah” to mean “play the news”, or “æi” to mean “what’s the weather?”
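If I understood her correctly, the heart of the idea is a lookup table from sounds to phrases. Here’s a minimal sketch of what I imagine that dictionary looking like in Python; the “ah” and “æi” entries come from her examples, and the rest are my own made-up additions, not anything from the talk.

```python
# A toy sketch of a cue-to-phrase dictionary. "ah" and "æi" are the examples
# from the talk; the other entries are hypothetical mappings I invented.
CUE_COMMANDS = {
    "ah": "play the news",
    "æi": "what's the weather?",
    "hum": "turn on the lights",   # hypothetical mapping
    "whistle": "stop the music",   # hypothetical mapping
}

def translate_cue(cue: str) -> str | None:
    """Return the spoken-language command a recognized cue stands for, if any."""
    return CUE_COMMANDS.get(cue)
```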
Ms. Jaddoh is proposing to train a Raspberry Pi to listen for the non-verbal voice cue and translate it to the correct command. To preserve the privacy of the user, the Raspberry Pi would pass the command as text to Amazon’s or Google’s cloud service. The cloud service would then send the command to the assistant, which would perform the requested action.
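The talk didn’t go into implementation details, so the shape of the code below is my own guess. The function names (recognize_cue, send_to_assistant_cloud) are placeholders I invented, not real APIs from Amazon, Google, or the research; the point is just that only the translated text, never the audio, would leave the Raspberry Pi.

```python
# A rough, hypothetical sketch of the on-device pipeline as I understood it:
# audio stays on the Pi, and only the translated text command goes to the cloud.

def recognize_cue(audio_clip: bytes) -> str | None:
    """Classify a short audio clip as one of the known non-verbal cues.
    Placeholder: the actual classifier was not described in the talk."""
    raise NotImplementedError

def send_to_assistant_cloud(command_text: str) -> None:
    """Forward the command as plain text to the assistant's cloud service.
    Placeholder: the real cloud API call is not specified here."""
    raise NotImplementedError

def handle_audio(audio_clip: bytes, cue_commands: dict[str, str]) -> None:
    """Map a recognized cue to its command and pass only the text along."""
    cue = recognize_cue(audio_clip)
    if cue is None:
        return  # not a recognized cue; nothing leaves the device
    command = cue_commands.get(cue)
    if command:
        send_to_assistant_cloud(command)
```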
She said that the non-verbal cues would be chosen based on dysarthric phonetic features and articulation capabilities, in addition to input from the target users. I’m a little unclear on that part. It sounded like she was saying that there is a set of non-verbal cues that most people with dysarthria can make. I have done zero studies of dysarthria (heck, I just learned the word this week), but given the range of causes of dysarthria, doesn’t it seem unlikely that everyone could make a common set of sounds? She did say they would be interviewing people with dysarthria to learn which sounds they can make. I’m sure she knows what she’s doing, but I’d sure like to know more if anyone has information on this.
Ms. Jaddoh didn’t describe how the training of the Raspberry Pi would occur. In a finished product, I envision that you’d have an interface to the Raspberry Pi where you make a sound, and then type in what you want that sound to represent. When you think about it, even without dysarthria, wouldn’t you rather just yell “BAH!” at your home assistant instead of, “Hey A-Lady, turn on the basement lights”? At any given time, one of my smart devices has a name I simply can’t remember, so I think I’d really like to just make phonetic sounds instead.
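To be clear, this next bit is pure speculation on my part, not anything from the talk: a tiny sketch of what that enrollment step might look like, where you record a cue and then type in what it should mean. The file name and function are entirely invented.

```python
# Purely speculative enrollment sketch (my idea, not Ms. Jaddoh's):
# record a cue, ask what it should mean, and save the pair for later lookup.
import json
from pathlib import Path

MAPPING_FILE = Path("cue_mappings.json")  # hypothetical storage location

def enroll_cue(cue_label: str) -> None:
    """Ask the user what a newly recorded cue should mean and persist it."""
    command = input(f'What should the sound "{cue_label}" mean? ')
    mappings = json.loads(MAPPING_FILE.read_text()) if MAPPING_FILE.exists() else {}
    mappings[cue_label] = command
    MAPPING_FILE.write_text(json.dumps(mappings, indent=2, ensure_ascii=False))

# Example: enroll_cue("bah") could map "BAH!" to "turn on the basement lights".
```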
Ms. Jaddoh’s work is still a study at this point, to see whether the concept will work, but I like how she’s targeting a genuinely important problem.