This study examined the perceptual processing of time-gated auditory-visual (AV), auditory (A), and visual (V) spoken words. The primary goal was to assess the extent to which stimulus information versus perceptual processing limitations underlie modality-related differences in perceptual encoding speed in AV, A, and V spoken word recognition. A second goal was to add to the scant literature on the comparative time-course of phonetic information in AV, A, and V spoken words. In terms of the duration of speech signal required for accurate word identification, the ordering was AV < A < V. For individual word stimuli, there were strong predictive relations between unimodal encoding speed and gating measures. Perceptual encoding of V words was slower than predicted from stimulus information alone.