Monaural speech segregation using synthetic speech signals

Douglas S. Brungart*, Nandini Iyer, Brian D. Simpson

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

When listening to natural speech, listeners are fairly adept at using cues such as pitch, vocal tract length, prosody, and level differences to extract a target speech signal from an interfering speech masker. However, little is known about the cues that listeners might use to segregate synthetic speech signals that retain the intelligibility characteristics of speech but lack many of the features that listeners normally use to segregate competing talkers. In this experiment, intelligibility was measured in a diotic listening task that required the segregation of two simultaneously presented synthetic sentences. Three types of synthetic signals were created: (1) sine-wave speech (SWS); (2) modulated noise-band speech (MNB); and (3) modulated sine-band speech (MSB). The listeners performed worse for all three types of synthetic signals than they did with natural speech signals, particularly at low signal-to-noise ratio (SNR) values. Of the three synthetic signals, the results indicate that SWS signals preserve more of the voice characteristics used for speech segregation than MNB and MSB signals. These findings have implications for cochlear implant users, who rely on signals very similar to MNB speech and thus are likely to have difficulty understanding speech in cocktail-party listening environments.
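To make the modulated noise-band (MNB) condition concrete, the sketch below shows a generic noise-vocoding scheme of the kind the abstract likens to cochlear implant processing: the signal is split into analysis bands, each band's temporal envelope is extracted, and the envelope modulates band-limited noise. The band edges, filter orders, and envelope cutoff here are illustrative assumptions, not the paper's actual parameters.

```python
# Hypothetical sketch of modulated noise-band (noise-vocoded) speech.
# All parameter choices below are assumptions for illustration.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, band_edges_hz, env_cutoff_hz=50.0):
    """Replace each band's fine structure with envelope-modulated noise."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech, dtype=float)
    env_sos = butter(2, env_cutoff_hz, btype="low", fs=fs, output="sos")
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, speech)               # analysis band
        env = sosfiltfilt(env_sos, np.abs(hilbert(band)))  # smoothed envelope
        noise = sosfiltfilt(band_sos, rng.standard_normal(len(speech)))
        out += np.clip(env, 0.0, None) * noise             # modulated noise band
    return out

# Toy input: an amplitude-ramped 440 Hz tone stands in for a speech signal.
fs = 16000
t = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 440 * t) * np.linspace(0.0, 1.0, fs)
vocoded = noise_vocode(sig, fs, band_edges_hz=[100, 300, 700, 1500, 3000])
```

Because only the band envelopes survive this processing, pitch and other fine-structure voice cues are discarded, which is consistent with the abstract's finding that MNB signals are harder to segregate than sine-wave speech.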

Original language: English
Pages (from-to): 2327-2333
Number of pages: 7
Journal: Journal of the Acoustical Society of America
Volume: 119
Issue number: 4
DOIs
State: Published - Apr 2006
Externally published: Yes

