TY - JOUR
T1 - Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony
AU - Grant, Ken W.
AU - Van Wassenhove, Virginie
AU - Poeppel, David
N1 - Funding Information:
This research was supported by the Clinical Investigation Service, Walter Reed Army Medical Center, under Work Unit #00-2501 and by grant numbers DC 000792-01A1 from the National Institute on Deafness and Other Communication Disorders to Walter Reed Army Medical Center, SBR 9720398 from the Learning and Intelligent Systems Initiative of the National Science Foundation to the International Computer Science Institute, and DC 004638-01 and DC 005660-01 from the National Institute on Deafness and Other Communication Disorders to the University of Maryland. A preliminary report of this work was presented at the International Speech Communication Association (ISCA) Tutorial and Research Workshop on Audio Visual Speech Processing (AVSP), St Jorioz, France, 4–7 September 2003. We would like to thank Dr. Steven Greenberg and Dr. Van Summers for their support and many fruitful discussions concerning this work. The opinions or assertions contained herein are the private views of the authors and should not be construed as official or as reflecting the views of the Department of the Army or the Department of Defense.
PY - 2004/10
Y1 - 2004/10
N2 - Detection thresholds for temporal synchrony in auditory and auditory-visual sentence materials were obtained from normal-hearing subjects. For auditory conditions, thresholds were determined using an adaptive-tracking procedure to control the degree of temporal asynchrony of a narrow audio band of speech, both positive and negative in separate tracks, relative to three other narrow audio bands of speech. For auditory-visual conditions, thresholds were determined in a similar manner for each of four narrow audio bands of speech as well as a broadband speech condition, relative to a video image of a female speaker. Four different auditory filter conditions, as well as a broadband auditory-visual speech condition, were evaluated in order to determine whether detection thresholds depended on the spectral content of the acoustic speech signal. Consistent with previous studies of auditory-visual speech recognition, which showed a broad, asymmetrical range of temporal synchrony over which intelligibility was largely unaffected (audio delays roughly between -40 ms and +240 ms), auditory-visual synchrony detection thresholds also showed a broad, asymmetrical pattern of similar magnitude (audio delays roughly between -45 ms and +200 ms). No differences in synchrony thresholds were observed for the different filtered bands of speech, or for broadband speech. In contrast, detection thresholds for audio-alone conditions were much smaller (between -17 ms and +23 ms) and symmetrical. These results suggest a fairly tight coupling between a subject's ability to detect cross-spectral (auditory) and cross-modal (auditory-visual) asynchrony and the intelligibility of auditory and auditory-visual speech materials. Published by Elsevier B.V.
KW - Auditory-visual speech processing
KW - Cross-modal asynchrony
KW - Spectro-temporal asynchrony
UR - http://www.scopus.com/inward/record.url?scp=10444249633&partnerID=8YFLogxK
U2 - 10.1016/j.specom.2004.06.004
DO - 10.1016/j.specom.2004.06.004
M3 - Article
AN - SCOPUS:10444249633
SN - 0167-6393
VL - 44
SP - 43
EP - 53
JO - Speech Communication
JF - Speech Communication
IS - 1-4 SPEC. ISS.
ER -