TY - CONF
T1 - Spectro-temporal interactions in auditory and auditory-visual speech processing
AU - Grant, Ken W.
AU - Greenberg, Steven
N1 - Funding Information:
This research was supported by grant numbers DC 000792-01A1 from the National Institute on Deafness and Other Communication Disorders to Walter Reed Army Medical Center and SBR 9720398 from the Learning and Intelligent Systems Initiative of the National Science Foundation to the International Computer Science Institute. The opinions or assertions contained herein are the private views of the authors and should not be construed as official or as reflecting the views of the Department of the Army or the Department of Defense.
PY - 2003
Y1 - 2003
N2 - Speech recognition often involves face-to-face communication between two or more individuals. The combined influences of auditory and visual speech information lead to a remarkably robust signal that is highly resistant to noise, reverberation, hearing loss, and other forms of signal distortion. Studies of auditory-visual speech processing have revealed that speech-reading interacts with audition in both the spectral and temporal domains. For example, not all speech frequencies are equal in their ability to supplement speech-reading, with low-frequency speech cues providing more benefit than high-frequency speech cues. Additionally, in contrast to auditory speech processing, which integrates information across frequency over relatively short time windows (20-40 ms), auditory-visual speech processing appears to use relatively long time windows of integration (roughly 250 ms). In this paper, some of the basic spectral and temporal interactions between auditory and visual speech channels are enumerated and discussed.
AB - Speech recognition often involves face-to-face communication between two or more individuals. The combined influences of auditory and visual speech information lead to a remarkably robust signal that is highly resistant to noise, reverberation, hearing loss, and other forms of signal distortion. Studies of auditory-visual speech processing have revealed that speech-reading interacts with audition in both the spectral and temporal domains. For example, not all speech frequencies are equal in their ability to supplement speech-reading, with low-frequency speech cues providing more benefit than high-frequency speech cues. Additionally, in contrast to auditory speech processing, which integrates information across frequency over relatively short time windows (20-40 ms), auditory-visual speech processing appears to use relatively long time windows of integration (roughly 250 ms). In this paper, some of the basic spectral and temporal interactions between auditory and visual speech channels are enumerated and discussed.
UR - http://www.scopus.com/inward/record.url?scp=85009170709&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85009170709
SP - 2557
EP - 2560
T2 - 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003
Y2 - 1 September 2003 through 4 September 2003
ER -