Cross-modal informational masking due to mismatched audio cues in a speechreading task

Douglas S. Brungart, Brian D. Simpson, Alex Kordik

Research output: Contribution to conference › Paper › peer-review


Although most known examples of cross-modal interactions in audio-visual speech perception involve a dominant visual signal that modifies the apparent audio signal heard by the observer, there may also be cases where an audio signal can alter the visual image seen by the observer. In this experiment, we examined the effects that different distracting audio signals had on an observer's ability to speechread a color and number combination from a visual speech stimulus. When the distracting signal was noise, time-reversed speech, or irrelevant continuous speech, speechreading performance was unaffected. However, when the distracting audio signal was speech that followed the same general syntax as the target speech but contained a different color and number combination, speechreading performance was dramatically reduced. This suggests that the amount of interference an audio signal causes in a speechreading task strongly depends on the semantic similarity of the target and masking phrases. The amount of interference did not, however, depend on the apparent similarity between the audio speech signal and the visible talker: masking phrases spoken by a talker of a different sex than the visible talker interfered nearly as much with the speechreading task as masking phrases spoken by the same talker used in the visual stimulus. A second experiment that examined the effects of desynchronizing the audio and visual signals found that the amount of interference caused by the audio phrase decreased when it was time-advanced or time-delayed relative to the visual target, but that time shifts as large as 1 s were required before performance approached the level achieved with no audio signal. The results of these experiments are consistent with the existence of a kind of cross-modal "informational masking" that occurs when observers who see one word and hear another are unable to correctly determine which word was present in the visual stimulus.

Original language: English
Number of pages: 4
State: Published - 2003
Externally published: Yes
Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
Duration: 1 Sep 2003 - 4 Sep 2003



