Abstract
The current study examines the temporal parameters associated with cross-modal integration of auditory-visual information for sentential material. The speech signal was filtered into 1/3-octave channels, all of which were discarded except for a low-frequency (298-375 Hz) and a high-frequency (4762-6000 Hz) band. The intelligibility of this audio-only signal ranged between 9% and 31% for nine normal-hearing subjects. Intelligibility for visual-alone presentation of the same material ranged between 1% and 22%. When the audio and video signals are combined and presented in synchrony, intelligibility climbs to an average of 63%. When the audio signal leads the video, intelligibility declines appreciably even for the shortest asynchrony of 40 ms. Additional increases in video delay result in a progressive decline in intelligibility, reaching a level comparable to that of the audio-alone condition at an asynchrony of 400 ms. In contrast, when the video signal leads the audio, intelligibility remains relatively stable for onset asynchronies up to 160-200 ms. Hence, there is a marked asymmetry in the integration of audio and visual information, which has important implications for sensory-based models of auditory-visual speech processing.
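As a quick check on the band edges quoted in the abstract, the following minimal sketch computes the width of each retained band in octaves and the spectral gap between them. It uses only the four edge frequencies given above. The function names and the geometric-mean definition of band center are illustrative assumptions, not taken from the paper.

```python
import math

# Band edges in Hz, as quoted in the abstract.
LOW_BAND = (298.0, 375.0)
HIGH_BAND = (4762.0, 6000.0)


def width_in_octaves(lo_hz, hi_hz):
    """Width of a band in octaves: log2 of the ratio of its edge frequencies."""
    return math.log2(hi_hz / lo_hz)


def geometric_center(lo_hz, hi_hz):
    """Geometric mean of the band edges (an assumed definition of band center)."""
    return math.sqrt(lo_hz * hi_hz)


def gap_in_octaves(lower_band, upper_band):
    """Octaves between the top of the lower band and the bottom of the upper band."""
    return math.log2(upper_band[0] / lower_band[1])


if __name__ == "__main__":
    for name, (lo, hi) in (("low", LOW_BAND), ("high", HIGH_BAND)):
        print(
            f"{name} band: {width_in_octaves(lo, hi):.3f} octaves wide, "
            f"center near {geometric_center(lo, hi):.0f} Hz"
        )
    print(f"gap between bands: {gap_in_octaves(LOW_BAND, HIGH_BAND):.2f} octaves")
```

Both widths should come out close to one third of an octave, consistent with the 1/3-octave channel description, and the two bands are separated by several octaves of discarded spectrum.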
Original language | English |
---|---|
Pages | 132-137 |
Number of pages | 6 |
State | Published - 2001 |
Externally published | Yes |
Event | 2001 International Conference on Auditory-Visual Speech Processing, AVSP 2001 - Aalborg, Denmark; Duration: 7 Sep 2001 → 9 Sep 2001 |
Conference
Conference | 2001 International Conference on Auditory-Visual Speech Processing, AVSP 2001 |
---|---|
Country/Territory | Denmark |
City | Aalborg |
Period | 7/09/01 → 9/09/01 |