Reducing viseme confusion in speech-reading

Benjamin M. Gorman

ACM SIGACCESS Accessibility and Computing


Speech-reading is an invaluable technique for people with hearing loss or those in adverse listening conditions (e.g., in a noisy restaurant, near children playing loudly). However, speech-reading is often difficult because a single mouth shape (viseme) can correspond to several speech sounds (phonemes); that is, there is a one-to-many mapping from visemes to phonemes. This ambiguity decreases comprehension, causing confusion and frustration during conversation. My doctoral research aims to design and evaluate a visualisation technique that displays textual representations of a speaker’s phonemes to a speech-reader. By combining my visualisation with their pre-existing speech-reading ability, speech-readers should be able to disambiguate confusing viseme-to-phoneme mappings without shifting their focus from the speaker’s face. This should improve comprehension and support natural conversation.
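To make the one-to-many mapping concrete, the following minimal Python sketch shows how several words can share one viseme sequence, and how knowing a single phoneme (as a textual cue might provide) narrows the candidates. Everything in it is an illustrative assumption rather than part of the research: the viseme labels, the tiny lexicon, the use of letters as stand-ins for phonemes, and the function names `visemes_for`, `lookalikes`, and `disambiguate` are all hypothetical.

```python
# Illustrative assumption: a toy phoneme-to-viseme table. Letters stand in
# for phonemes, and the viseme labels are made up for this sketch.
PHONEME_TO_VISEME = {
    "p": "lips-closed", "b": "lips-closed", "m": "lips-closed",
    "f": "lip-to-teeth", "v": "lip-to-teeth",
    "a": "open",
    "t": "tongue-tip",
}

# Illustrative assumption: a five-word lexicon with toy pronunciations.
LEXICON = {
    "pat": ["p", "a", "t"],
    "bat": ["b", "a", "t"],
    "mat": ["m", "a", "t"],
    "fat": ["f", "a", "t"],
    "vat": ["v", "a", "t"],
}


def visemes_for(phonemes):
    """Map a phoneme sequence to the viseme sequence a speech-reader sees."""
    return tuple(PHONEME_TO_VISEME[p] for p in phonemes)


def lookalikes(target, lexicon):
    """Return, in sorted order, every word that looks the same as `target`."""
    seen = visemes_for(lexicon[target])
    return sorted(w for w, ph in lexicon.items() if visemes_for(ph) == seen)


def disambiguate(candidates, lexicon, first_phoneme):
    """Keep only the candidates whose first phoneme matches the given cue."""
    return [w for w in candidates if lexicon[w][0] == first_phoneme]


if __name__ == "__main__":
    confusable = lookalikes("bat", LEXICON)
    print(confusable)
    print(disambiguate(confusable, LEXICON, "b"))
```

In this toy setting, the words "pat", "bat", and "mat" are indistinguishable from mouth shape alone, and a cue for the first phoneme is enough to single out one of them.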

This article is an extended version of the research I presented at the 2015 ASSETS Doctoral Consortium, held in Lisbon, Portugal.
