Linguistic Laboratory for Speech Prosody

University of Illinois at Urbana-Champaign


We conduct corpus studies and experiments to investigate prosody in English and other languages. Our projects examine prosody perception and production in spontaneous and read speech. We also work on methods for prosodic annotation by humans, and automatic prosody detection by machines.

Current projects

Constituents and Heads in Prosody Perception

NSF BCS 12-51343; 2013-2016
Investigators: Cole (PI), Hualde (co-PI, UIUC), Caroline Smith (co-PI, U New Mexico); Mahrt, Eager, Im
Publications: 68, 70, 73

This project investigates the relationship between prosodic phrasing (constituents) and prosodic prominence (heads) in a comparative study of English, Spanish and French. These languages are known to differ not only in their prosody (e.g., intonation and rhythmic patterns), but also in the associations linking prosody to syntax and semantics. Experiments using the Rapid Prosody Transcription method, developed in our prior work to investigate prosody perception among ordinary (untrained, non-expert) listeners, show how listeners perceive the prosodic phrasing and prominence patterns of an utterance when presented with speech samples that differ in their phonetic properties (pitch and timing), and in syntactic and semantic features. Findings shed light on the interplay of acoustic cues and top-down features from the syntactic, semantic and discourse context in the perception of prosody in these languages, and will contribute to our understanding of cross-linguistic variation in the role prosody plays in conveying linguistic meaning.

Prosodic category structure

This project is a continuation of our research begun under NSF IIS 07-03624
Investigators: Cole (PI), Hasegawa-Johnson (co-PI, UIUC), Mahrt
Publications: 56, 59, 61, 71

Acoustic correlates of perceived prominence are found in measures of pitch, duration, and intensity in American English. This project investigates whether these acoustic correlates cue a binary prominence distinction (prominent vs. non-prominent) or a gradient prominence distinction (low-to-medium-to-high prominence). We further investigate whether all acoustic correlates cue the same pattern of prominence distinctions for a given speaker, and the extent of individual differences in how prominence distinctions are acoustically cued (by speakers) and perceived (by listeners).

Prosodic annotation with Rapid Prosody Transcription

This project is a continuation of our research begun under NSF IIS 07-03624
Investigators: Cole (PI), Mo, Yoon, Lee
Publications: 10, 29, 34, 37, 64-68, 70

Research on the form and function of prosody requires analysis of prosodically annotated speech materials, so an important focus of our work is on the development of a method for prosodic annotation that is robust to variability in prosodic expression due to speech style, individual speaker differences, and other contextual factors. Towards this goal we have developed Rapid Prosody Transcription (RPT), and we are investigating the relationship between prosodic annotation by untrained listeners using RPT and by expert transcribers using more complex prosodic annotations. We have used RPT to investigate prosody at the word and phrase levels in English, Spanish, Russian (Publications 64, 65) and Hindi (Publication 67).
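As an illustration of how annotations from many untrained listeners might be combined, the sketch below computes, for each word, the proportion of listeners who marked it as prominent. It is a minimal example under stated assumptions and not the lab's own software: the function name prominence_scores, the 0/1 input layout, and the sample data are all hypothetical.

```python
from typing import List


def prominence_scores(marks: List[List[int]]) -> List[float]:
    """Per-word proportion of listeners who marked the word as prominent.

    marks[i][j] is 1 if listener i marked word j, else 0.
    This data layout is an assumption made for illustration only.
    """
    if not marks:
        return []
    n_listeners = len(marks)
    n_words = len(marks[0])
    # Every listener must have responded to the same sequence of words.
    if any(len(row) != n_words for row in marks):
        raise ValueError("all listeners must annotate the same number of words")
    return [sum(row[j] for row in marks) / n_listeners for j in range(n_words)]


# Hypothetical responses from four listeners to a four-word utterance.
example_marks = [
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
]
scores = prominence_scores(example_marks)
```

A score near 1 means that most listeners heard the word as prominent, and a score near 0 means that few did. The same computation could be applied to boundary marks.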

Variation in plosive production: rate and prosodic factors

Investigators: Khasanova, Cole, Hasegawa-Johnson
Publications: 72; Khasanova, A., Ph.D. thesis (2013)

This project builds on former lab member Alina Khasanova's 2013 doctoral thesis, which investigated variability in the acoustic realization of plosive consonants in American English through analysis of data from the Buckeye Corpus of conversational speech. This work includes the implementation of a burst detector, used for the precise localization of bursts in phone-labeled speech.
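To give a sense of what burst localization within a phone-labeled interval can involve, here is a simple sketch that looks for the largest rise in short-time log energy inside a labeled plosive. It is not the detector developed in this project. The function name locate_burst, the frame parameters, and the synthetic signal are hypothetical choices for illustration.

```python
import math
from typing import Sequence


def locate_burst(samples: Sequence[float], start: int, end: int,
                 frame_len: int = 40, hop: int = 10) -> int:
    """Estimate the burst onset (sample index) inside the interval [start, end).

    Slides a frame across the interval, computes log energy per frame, and
    picks the frame with the largest rise over the previous frame. The rise
    is attributed to the samples that newly entered that frame.
    """
    eps = 1e-10  # keeps log() finite during silent closures
    positions = list(range(start, end - frame_len + 1, hop))
    if len(positions) < 2:
        raise ValueError("interval too short for the chosen frame settings")
    log_energy = [
        math.log(sum(x * x for x in samples[p:p + frame_len]) / frame_len + eps)
        for p in positions
    ]
    best = max(range(1, len(log_energy)),
               key=lambda k: log_energy[k] - log_energy[k - 1])
    # Samples [p + frame_len - hop, p + frame_len) are new in the best frame.
    return positions[best] + frame_len - hop


# Synthetic plosive: 200 samples of silent closure, then a 100-sample release.
signal = [0.0] * 200 + [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
burst_index = locate_burst(signal, 0, len(signal))
```

In this sketch the time resolution is limited by the hop size, so a smaller hop gives a finer estimate at the cost of more computation.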

Prosodic factors influencing reduction in spontaneous speech

Investigators: Cole, Shattuck-Hufnagel (M.I.T.)
Publications: 55, 60

Prosodic context influences speech production and conditions strengthening and weakening effects on segments and syllables. This project investigates the relationship between prosodic context and patterns of phonetic strengthening and reduction in a laboratory task using speech imitation.

Past projects

Landmark-based robust speech recognition using prosody-guided models of speech variability

NSF IIS 07-03624; 2007-2010, extended to 2012
Investigators: Cole (PI), Hasegawa-Johnson (co-PI, UIUC), Mo, Baek
Publications: 36, 37, 39-48, 50-54, 56, 57, 59, 61, 63

This project examines prosody from the dual perspectives of the speaker and listener. We investigate how listeners perceive the prosodic features of conversational speech, and the relative contributions of acoustic cues and top-down factors (from lexical, syntactic and discourse context) to prosody perception. Our research on these topics was part of a collaborative project led by Carol Espy-Wilson (PI, U Maryland), with Abeer Alwan (UCLA), Louis Goldstein (USC, Haskins Laboratories), Mary Harper (U Maryland), and Elliot Saltzman (Boston U). Our contributions relate to the broad project goals of developing acoustic landmark detectors and pattern classifiers for prosodic features and developing a model of the mapping from articulatory gestures that implement prosody to the acoustic output.

Acoustic correlates of prosody in broadcast news speech vs. spontaneous conversational speech

NSF IIS-0414117; 2006-2008; University of Illinois Critical Research Initiative; 2002-2003
Investigators: Cole (co-PI), Hasegawa-Johnson (PI, UIUC), Chavarria, H. Choi, J. Choi, Kim, Mo, Yoon
Publications: 1, 2, 6, 9, 12, 16, 17, 18, 21, 22, 23, 24, 25, 27, 33, 34

This project compares prosodic structures and the acoustic features that cue prosody in read and spontaneous speech based on two corpora: the Boston University Radio News corpus of read speech and the Switchboard corpus of spontaneous telephone conversational speech.

Disfluency and prosody

NSF IIS-0414117; 2006-2008
Investigators: Hasegawa-Johnson (PI, UIUC), Cole (co-PI), Shih (co-PI, UIUC), Borys, Kim, Mo, Yoon
Publications: 13, 15, 19, 30, 31, 38

Prosody research that uses spontaneous speech data, such as the Switchboard corpus, must deal with the effect of disfluency on prosodic structure. This project explores the acoustic features that cue disfluency regions in running speech and the interaction of disfluency and prosodic structure on segmental and supra-segmental acoustic features. Work on this project includes the analysis of acoustic cues to glottalization, which often marks prosodic structure.

Prosody-dependent automatic speech recognition

NSF IIS-0414117; 2006-2008
University of Illinois Critical Research Initiative, 2002-2003
Investigators: Hasegawa-Johnson (PI, UIUC), Cole (co-PI), Borys, Chavarría, Chen, J. Choi, H. Choi, Cohen, Kim, Yoon
Publications: 3, 4, 5, 7, 8, 11, 14, 20, 26, 28, 32

This project seeks to describe the interaction between prosodic structure and phoneme structure in conditioning acoustic variation in natural continuous speech. Our approach combines linguistic phonetic analysis and probabilistic speech recognition models to identify prosodic effects. This research has succeeded in demonstrating that the use of prosody can lead to improved word recognition accuracy in a large-vocabulary speech recognition experiment. A second goal of this research is the creation of techniques for the automatic labeling of prosodic structure and disfluency in speech.

Prosody and bilingualism

Publications: Puri, V., Ph.D. thesis (2013)

Former lab member Vandana Puri (PhD, 2013) investigated the intonation and prosodic systems of Indian English and Hindi spoken by late and simultaneous bilinguals in Delhi, India. Analysis of F0 contours of pitch accents in the two languages reveals that late bilinguals of Hindi and English maintain one intonational grammar for both their languages, whereas simultaneous bilinguals use a combination of the Hindi pitch accent (L*H) and the most common British pitch accent (H* or H*L) within the same intonational phrase.




