USC Phonetics and Phonology Group
Research Projects at the USC Phonetics Laboratory
Prosody and Articulatory Dynamics in Spoken Language
Dani Byrd, Shrikanth Narayanan and Sungbok Lee (EE & Ling)
with Elliot Saltzman (Haskins Laboratories & Boston University) and Rebeka Campos, Susie Choi, Jelena Krivokapic, and Daylen Riggs
Funding: NIH DC03172
The long-term objective of the proposed research program is to understand how linguistic structure conditions the spatiotemporal realization of articulatory movement during speaking. As research in speech production becomes more integrated with linguistic theory, it has become increasingly clear that segmental articulation cannot be understood independently of prosodic structure. Such structure includes, but is not limited to, prominence and phrasal organization, and effects of these high-level prosodic aspects of linguistic structure pervade low-level articulatory behavior. However, despite the pervasiveness of these effects, only a very few prosodic signatures have been identified at the level of articulatory patterning. This research program investigates the relation between one aspect of prosodic structure (phrasal structure) and the control and coordination of articulation within a dynamical systems model of speech production. The specific aim of this proposal is to understand how speakers modulate the spatiotemporal organization of oral articulatory gestures as a function of their phrasal positions. A series of studies is described that falls into three areas: the kinematic characteristics of speech gestures in the vicinity of phrasal junctures, the categorical versus gradient nature of those junctures as manifested in articulation, and computational modeling of the systematic variability in articulation that occurs at phrase edges. The specific aims will be pursued using articulatory movement data collected with a magnetometer system and by elaboration of the well-known Task Dynamic computational model of speech production. Please see Dani Byrd's research statement for more detail.
D. Byrd & E. Saltzman. (1998) Intragestural dynamics of multiple phrasal boundaries. Journal of Phonetics, 26:173-199. [pdf]
D. Byrd, A. Kaun, S. Narayanan, & E. Saltzman. (2000) Phrasal signatures in articulation. In M. B. Broe and J. B. Pierrehumbert (Eds.), Papers in Laboratory Phonology V. Cambridge: Cambridge University Press, 70-87. [pdf]
D. Byrd & E. Saltzman. (2003) The elastic phrase: Modeling the dynamics of boundary-adjacent lengthening. Journal of Phonetics, 31(2), 149-180. [pdf protected by Academic Press] [alternative pdf]
D. Byrd, S. Lee, D. Riggs, and J. Adams. (2005) Interacting effects of syllable and phrase position on consonant articulation. Journal of the Acoustical Society of America 118(6), 3860-3873. [pdf]
This is our magnetometer team: Narayanan, Byrd, Lee, and (on right) Mah
SPAN: Speech Production and Articulation kNowledge Group
We explore the puzzling juxtaposition of underlying invariance of control and surface variability in performance during speech production, and outline how a dynamical systems approach can contribute to solving this puzzle. Articulatory patterning at phrase edges is used as an example of how the surface expression of underlyingly invariant phonological units can vary in a linguistically principled way. We use computational simulations to model these phrase boundary effects as prosodically induced local temporal slowing. This slowing is generated by dynamical effects on the parameter specification of articulatory gestures. This focus allows us to examine a specific view of how underlying temporal characteristics of linguistic units can be modulated for communicative ends in the production of a particular utterance.
D. Byrd & E. Saltzman. (2003) The elastic phrase: Modeling the dynamics of boundary-adjacent lengthening. Journal of Phonetics, 31(2), 149-180. [pdf protected by Academic Press]
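The idea of local temporal slowing can be illustrated with a minimal sketch. It is not the model from the paper: it assumes a single gesture behaving as a critically damped unit-mass spring, and a hypothetical `slowing` function that scales down the local clock rate near a boundary. The stiffness, step size, and slowing values are illustrative assumptions.

```python
import math

def simulate_gesture(target=1.0, stiffness=400.0, dt=0.001, duration=0.5,
                     slowing=lambda t: 0.0):
    """Integrate a critically damped unit-mass spring toward `target`.

    `slowing(t)` returns a value in [0, 1) that reduces the local clock
    rate at time t (0 means no slowing). All values are assumptions.
    """
    damping = 2.0 * math.sqrt(stiffness)  # critical damping for unit mass
    x, v, t = 0.0, 0.0, 0.0
    trajectory = []
    for _ in range(int(round(duration / dt))):
        rate = 1.0 - slowing(t)  # fraction of internal time that elapses
        accel = -stiffness * (x - target) - damping * v
        # Semi-implicit Euler step taken in slowed (internal) time.
        v += accel * dt * rate
        x += v * dt * rate
        t += dt
        trajectory.append((t, x))
    return trajectory

def time_to_reach(trajectory, level):
    """First time at which the gesture position reaches `level`, else None."""
    for t, x in trajectory:
        if x >= level:
            return t
    return None

plain = simulate_gesture()
# Hypothetical boundary effect: the clock runs at half speed from 50 ms on.
slowed = simulate_gesture(slowing=lambda t: 0.5 if t >= 0.05 else 0.0)
t_plain = time_to_reach(plain, 0.9)
t_slowed = time_to_reach(slowed, 0.9)
```

With the same underlying parameters, the slowed gesture reaches 90% of its target later than the unslowed one, so the surface duration stretches while the gesture's specification stays unchanged.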
An articulatory view of Kinyarwanda's coronal harmony
Rachel Walker, Dani Byrd, Fidèle Mpiranya
Acknowledgments to: Sungbok Lee, Celeste DeFreitas, and Brian Ronge
Funding: Provost's Undergraduate Research Program and NIH DC03172
This
paper addresses theoretical issues surrounding coronal harmony through
an instrumental study of Kinyarwanda. Retroflex harmony in Kinyarwanda
causes alveolar [s, z] to become retroflex when preceding a retroflex
fricative within a stem. Intervening coronal stops, affricates,
and
palatal consonants block coronal harmony, but the flap and non-coronal
consonants are reported to be transparent. This harmony system
bears
on a theoretical debate. Some researchers have suggested that
coronal
harmony extends a continuous tongue tip-blade gesture (or feature) with
the result that the gesture is present during “transparent” segments,
but without perceptible effect (e.g. Flemming 1995, Ní
Chiosáin &
Padgett 1997, Gafos 1999). We refer to this as the Gesture Extension
Model. An alternative scenario posits that harmony causes the
tip-blade gesture to be repeated in a harmonizing consonant but does
not cause it to be present during intervening segments (e.g. Hansson
2001, Rose & Walker 2004). We refer to this as the Repeated
Gesture Model. Kinematic data on the production of consonants in
Kinyarwanda were collected using electromagnetic articulography.
The mean angle for receivers placed on the tongue tip and blade was
calculated over the consonant intervals. Mean angle reliably
distinguished alveolar and retroflex fricatives, with alveolars showing
a lower tip relative to blade. Several issues were explored
involving
the status of target consonants, blockers, and transparent
segments.
Notably, in contexts where [m] and [k] are perceived as transparent,
their mean tip-blade angle was significantly different from contexts
where harmony did not occur. Furthermore, mean angle during
transparent [m] showed no significant difference from mean angle during
retroflex fricatives, suggesting that the tip-blade angle is sustained
systematically through transparent consonants but without perceptible
effect. This supports the Gesture Extension Model for coronal
harmony
in Kinyarwanda.
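A tip-blade angle of the kind described above could be computed along the following lines. This is only a sketch: the coordinate convention (x toward the front of the mouth, y upward), the function names, and the sample positions are assumptions for illustration, not the study's actual procedure or data.

```python
import math

def tip_blade_angle(tip, blade):
    """Angle in degrees of the blade-to-tip line relative to horizontal.

    `tip` and `blade` are (x, y) receiver positions. A tip lower than the
    blade gives a negative angle under the assumed coordinate convention.
    """
    dx = tip[0] - blade[0]
    dy = tip[1] - blade[1]
    return math.degrees(math.atan2(dy, dx))

def mean_angle(samples):
    """Arithmetic mean of the angle over a consonant interval.

    `samples` is a list of (tip, blade) pairs, one per time sample. A plain
    mean is adequate only while the angles stay far from the +/-180 wrap.
    """
    angles = [tip_blade_angle(tip, blade) for tip, blade in samples]
    return sum(angles) / len(angles)

# Hypothetical receiver positions in millimetres, for illustration only.
alveolar_interval = [((10.0, -2.0), (0.0, 0.0)), ((10.0, -1.5), (0.0, 0.0))]
retroflex_interval = [((10.0, 3.0), (0.0, 0.0)), ((10.0, 3.5), (0.0, 0.0))]
alveolar_mean = mean_angle(alveolar_interval)
retroflex_mean = mean_angle(retroflex_interval)
```

Under this convention a lower tip relative to the blade yields a smaller mean angle, which is the direction of the alveolar versus retroflex difference reported above.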
Functional data analysis of prosodic effects on articulatory timing
Sungbok Lee, Dani Byrd, Jelena Krivokapic
Funding: NIH DC03172
An application of functional data analysis
(FDA)
(Ramsay & Silverman, 1997) for linguistic experimentation is
explored.
The time-warping function provided by FDA is shown to offer novel
advantages
in the investigation of articulatory timing. Traditionally,
articulatory
studies examining the effects of linguistic variables such as prosody
on
articulatory timing have relied on kinematic landmarks to define speech
intervals of interest. However, we present a novel approach that
allows
the analysis of the entire, continuous kinematic trajectories obtained
in
various experimental conditions, specifically, in the presence or
absence
of a phrase boundary. FDA time warping functions after alignment
of
test and reference (control) signals indicate slowing of articulator
movement
as the speech stream recedes from the phrase boundary. This is a
theoretically
predicted pattern (Byrd & Saltzman, 2003), which would be more
difficult
to validate with a traditional interval-based approach. However,
there exist tokens for which FDA is problematic, and some potential remedies
are outlined. Despite certain limitations, generally, FDA is
shown
to be a useful tool for characterizing timing patterns in linguistic
experimentation
based on continuous kinematic trajectories.
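The notion of a time-warping function between a test and a reference trajectory can be sketched with dynamic time warping (DTW). DTW is swapped in here as a simpler stand-in: FDA registration fits smooth monotone warping functions, which this sketch does not do. The signals and function names are assumptions for illustration.

```python
def dtw_path(reference, test):
    """Return the lowest-cost monotone alignment as (ref_index, test_index) pairs."""
    n, m = len(reference), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - test[j - 1])
            cost[i][j] = local + min(cost[i - 1][j - 1],
                                     cost[i - 1][j],
                                     cost[i][j - 1])
    # Backtrack from the end of both signals to their start.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = []
        if i > 1 and j > 1:
            steps.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 1:
            steps.append((cost[i - 1][j], i - 1, j))
        if j > 1:
            steps.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(steps)
        path.append((i - 1, j - 1))
    path.reverse()
    return path

def first_alignment(path):
    """Map each reference index to the earliest test index aligned with it."""
    warp = {}
    for ref_index, test_index in path:
        warp.setdefault(ref_index, test_index)
    return warp

# Hypothetical trajectories: the test signal moves at half speed in its second half.
reference = [float(k) for k in range(10)]
test = [0, 1, 2, 3, 4, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9]
path = dtw_path(reference, test)
warp = first_alignment(path)
early_slope = (warp[5] - warp[0]) / 5
late_slope = (path[-1][1] - warp[5]) / (path[-1][0] - 5)
```

A local slope above 1 means the test signal needs more samples than the reference to cover the same stretch of movement, which is how slowing shows up in a warping function.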
The complexities of how prosodic structure, both at the phrasal and syllable levels, shapes speech production have begun to be illuminated through studies of articulatory behavior. The present study contributes to an understanding of prosodic signatures on articulation by examining the joint effects of phrasal and syllable position on the production of consonants. Articulatory kinematic data were collected for five subjects using electromagnetic articulography (EMA) to record target consonants (labial, labiodental, & tongue tip), located in (1) either syllable final or initial position and (2) either at a phrase edge or phrase-medially. Spatial and temporal characteristics of the consonantal constriction formation and release were determined based on kinematic landmarks in the articulator velocity profiles. The results indicate that syllable and phrasal position consistently affect the movement duration; however, effects on displacement were more variable. For most subjects, the boundary-adjacent portions of the movement (constriction release for a pre-boundary coda and constriction formation for a post-boundary onset) are not differentially affected in terms of phrasal lengthening; both lengthen comparably.
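Landmark-based measurement of this kind can be sketched as follows. The abstract does not state how the landmarks were defined, so the 20%-of-peak-velocity threshold, the function names, and the synthetic trajectory are assumptions for illustration only.

```python
import math

def velocity(position, dt):
    """Central-difference velocity (one-sided at the two ends)."""
    n = len(position)
    result = []
    for k in range(n):
        lo = max(k - 1, 0)
        hi = min(k + 1, n - 1)
        result.append((position[hi] - position[lo]) / ((hi - lo) * dt))
    return result

def movement_landmarks(position, dt, threshold=0.2):
    """Locate onset, peak velocity, and offset of a single movement.

    Onset and offset are the outermost samples around the velocity peak
    whose speed stays at or above `threshold` times the peak speed.
    """
    speed = [abs(s) for s in velocity(position, dt)]
    peak = max(range(len(speed)), key=speed.__getitem__)
    cutoff = threshold * speed[peak]
    onset = peak
    while onset > 0 and speed[onset - 1] >= cutoff:
        onset -= 1
    offset = peak
    while offset < len(speed) - 1 and speed[offset + 1] >= cutoff:
        offset += 1
    return {
        "onset": onset,
        "peak_velocity": peak,
        "offset": offset,
        "duration": (offset - onset) * dt,
        "displacement": abs(position[offset] - position[onset]),
    }

# Synthetic constriction formation: rest, a smooth rise from 0 to 1, then hold.
dt = 0.005
position = []
for k in range(80):
    if k < 20:
        position.append(0.0)
    elif k <= 60:
        position.append(0.5 * (1.0 - math.cos(math.pi * (k - 20) / 40)))
    else:
        position.append(1.0)
landmarks = movement_landmarks(position, dt)
```

Duration and displacement are then read off between the onset and offset landmarks, so the two measures can be compared across syllable and phrasal positions.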
Prosodic complexity and phrase length as factors in pause duration
Jelena Krivokapic
Research on factors influencing pauses has mainly focused on the impact of syntax, discourse
and prosodic structure on the likelihood of pause occurrence and on the
impact of syntactic structure on the duration of pauses within an
utterance. Very little is known about what factors, apart from
syntactic factors, play a role in
determining the length of pauses between utterances or phrases.
This experiment examines the effect of prosodic structure and
phrase length on pause duration. Subjects read 24 English sentences
varying along the following parameters: a) the length in syllables of
the intonational phrase preceding and following
the pause and b) the prosodic structure of the intonational phrase
preceding and following the pause, specifically whether or not
the intonational phrase branches into smaller phrases. In order
to minimize variability due to speech rate and individual differences,
speakers read sentences synchronously in dyads (Cummins 2002, Zvonik
& Cummins 2002). The results show that length has a significant
effect on pause duration both pre- and
post-boundary for all dyads and that prosodic complexity has a
significant post-boundary effect for some dyads. The possible reasons
for the observed pause duration effects and the implications of
these results on the question of incrementality in speech production
are discussed.
Funding: NIH DC0317
The influence of phonemic vowel length on the voicing effect
This study tests the influence
of
phonemic vowel length on the realization of the voicing effect, i.e.,
the
phonetic process by which vowels tend to be longer before voiced
obstruents
than before voiceless ones. The literature on the voicing effect has
identified
a number of factors that influence the degree of this effect (Hussein
1994),
among them the presence of phonemic vowel length (e.g. Keating 1985).
However,
there appear to be no published reports of experiments to test this
claim
about the influence of contrastive vowel length. In order to test the
hypothesis
that the presence of phonemic vowel length attenuates the voicing
effect,
it is necessary to isolate phonemic vowel length from other possible
conditioning
factors. This can be done by testing a language where length is
contrastive for only a subset of its vowel qualities, i.e., a language
that has some vowels unpaired for the long/short contrast. The prediction
is that vowels without a short/long counterpart will exhibit a stronger
voicing effect than vowels that are
part of a long/short contrast. Lithuanian shows such an asymmetrical
system.
Lithuanian mid vowels lack a contrast for duration; they are always
long.
Thus, the hypothesis is that the voicing effect will be greater for
/e:,
o:/ than for the other vowels.
Acoustic data from native speakers of Lithuanian were collected. The
stimuli
consisted of bisyllabic nonsense words of the shape CV1C1C2V, where V1
could
be any of the Lithuanian vowels and the sequence C1C2 was either /kS/
or
/gZ/. The results show that the difference in vowel duration before
voiced
obstruents and before voiceless ones, i.e., the voicing effect, is
greatest
for /e:/ and /o:/ (p<.05), compared to the other vowels. Our
experiment
concludes that the vowels unpaired for length (/e:, o:/) are more
impacted
by the voicing effect. Vowels with a long/short counterpart are
influenced
to a lesser degree. This supports our hypothesis. More generally, this
conclusion
provides evidence for the influence of phonemic contrast on phonetic
realization,
previously discussed in relation to coarticulation (Manuel 1999) and
the
cues to stress (Berinstein 1978). Furthermore, the asymmetrical
Lithuanian
system suggests the importance of minimal contrast in the phonological
representation.
If a vowel differs from another vowel only in length, then it minimally
contrasts
for length. Our experiment shows that vowels minimally contrastive for
length
behave differently from vowels that do not minimally contrast for
length.
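The comparison described above comes down to a difference of mean vowel durations per vowel. A minimal sketch, with invented durations that are not the study's data and a hypothetical helper name:

```python
from statistics import mean

def voicing_effect(before_voiced_ms, before_voiceless_ms):
    """Mean vowel duration before voiced minus before voiceless obstruents (ms)."""
    return mean(before_voiced_ms) - mean(before_voiceless_ms)

# Invented durations in milliseconds, for illustration only.
durations = {
    "e:": {"voiced": [210, 220, 230], "voiceless": [170, 180, 190]},  # unpaired for length
    "a:": {"voiced": [200, 205, 210], "voiceless": [185, 190, 195]},  # has a short counterpart
}
effects = {
    vowel: voicing_effect(d["voiced"], d["voiceless"])
    for vowel, d in durations.items()
}
```

The hypothesis predicts a larger value in `effects` for the vowels unpaired for length than for the vowels that take part in a long/short contrast.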
What is raddoppiamento? Length and prosody in Italian
Rebeka Campos-Astorkiza
Raddoppiamento fono-sintattico in Italian has received much attention in the literature. However, the phenomenon is still far from being explained and understood. Traditionally, Raddoppiamento refers to a lengthening process that affects word-initial consonants that follow a word ending in a stressed vowel. Furthermore, prosodic and syntactic constraints have been proposed that prevent this process from taking place (Nespor 1977, 1979). Unfortunately, most of the analyses and conclusions regarding Raddoppiamento instances lack a solid empirical foundation. This project aims at shedding light on the phenomenon by introducing considerations about the nature of the segments in the Raddoppiamento environment and different prosodic contexts. First, we consider not only consonants but also vowels as possible lengthened segments and examine whether their behavior patterns with that of consonants. Second, two different prosodic contexts are considered: the two relevant words are placed either phrase-internally or at the boundary of a phrase. According to previous analyses, Raddoppiamento is unexpected at the phrase boundary (Nespor 1977, Vogel 1977). Lastly, stressed and unstressed environments are tested.
The results showed that lengthening took place
in the traditional Raddoppiamento environment, i.e., when word1 ends in
a stressed vowel, word2 begins with a consonant, and there is no
intervening boundary between them. On the other hand, when the
initial segment in word2 was a vowel, this did not lengthen. This
result shows that any attempt at explaining the process must deal with
the fact that only consonants are subject to lengthening. As far as the
final vowel in word1 is concerned, this was significantly longer when
it carried the stress than when it was unstressed. Finally, the
presence of a boundary did not block the process categorically.
At the phrase juncture, the initial consonant in word2 was
significantly longer when the preceding vowel was stressed than when
the latter was unstressed. In view of this empirical evidence, some
of the accounts of Raddoppiamento will have to be revisited in order to
accommodate the data.
J. Acoust. Soc. Am. 116, 2645 (2004)
Some novel allophonic and phonemic phenomena in Biscayan Basque
Rebeka Campos-Astorkiza
An acoustic study of novel allophonic and phonemic phenomena in the isolate language Basque is presented. The focus is on speakers of the Biscayan dialect. First, Basque shows a spirantization process by which voiced plosives are produced as approximants, particularly intervocalically. Interestingly, we find that Basque /ld/ sequences, where spirantization is not expected [Hualde (1991) Basque Phonology], are realized as a lateral approximant followed by a voiced lateral fricative. Second, in this variety of Basque, the historical three-way contrast among sibilants (two alveolars and one postalveolar) has been reduced to a two-way distinction. The original contrast, still found in other varieties, between a laminal alveolar and an apical alveolar has merged with different results depending on the continuancy of the sibilants. Third, Basque presents a contrast between trill and flap intervocalically. However, elsewhere this is neutralized, and the precise realization of this segment varies from trill to frication. Finally, the Basque five-vowel inventory allows for almost any sequence of two vowels. The same vowel sequence might be a diphthong (tautosyllabic) or a hiatus (heterosyllabic) depending on the lexical item. That is, diphthongs and hiatus are contrastive.
J. Acoust. Soc. Am. 118, 1901 (2005)
Phonetic foundations of final /s/ patterning in South-Central Castilian Spanish
Ana Sanchez-Muñoz
Numerous studies have shown that in many
dialects of Spanish, the phoneme /s/ in final position may have several
realizations depending on a variety of factors (e.g. Lipski 1986;
Terrell 1979, 1981; Widdison 1995, 1996). However, there are few
analyses
of dialectal regions such as the area of South-West Central Castilian
Spanish, which because of its geographical location does not entirely
belong to any of the dialectal areas described for
Castilian. This study explores which factors affect
the different realizations of final /s/. It considers aspiration
([h]), deletion, velarization ([x]), or realization of the sibilant
([s]). Data were collected from six native speakers under a controlled
task. This experiment aimed at production of final /-s/ in three
grammatical words (los, mis and estos/éstos), taking into
account the following factors: 1) The sound that follows the target
final /s/ (consonant or vowel); 2) Whether the target word is phrase
final or not; 3) Whether the target word carries focal accent or not.
The results show that the first two factors are
highly significant whereas the third one is not. It is furthermore
observed that certain realizations of final /s/ are restricted by the
type of sound that follows it: specifically, [x] is restricted to the position before /k/. The
results show clear patterns of /s/ realization: [s] mainly occurs
before vowels and in prepausal position, and [ø] mainly before
consonants. It is argued that s-lenition in Spanish can be explained in
terms of two variations in the gestural score: changes in magnitude and
overlap among gestures. These results help understand in greater depth
the mechanisms leading to the different realizations of syllable-final
/s/ in Spanish as characterized in the proposed hypothesis.
Standard Eastern Armenian has a three-way VOT contrast among its oral stop series. While it is rare for such a 3-way contrast to be preserved in final position, Standard Eastern Armenian is claimed to do just this (Ladefoged & Maddieson 1996; Vaux 1998; Khachaturian 1988). This phenomenon provides an ideal opportunity to explore how prosodic structure influences the realization of a complex and delicate system of contrast maintained by temporal and saliency distinctions. The cues to this contrast, including VOT, closure duration, and burst amplitude, are examined in a variety of segmental and prosodic environments. The experiment evaluates effects of intonation phrase final, intermediate phrase final, word final and syllable final positions. We find that the 3-way VOT contrast is maintained in these prosodic domains in which the stop consonants are final. Speakers make significant VOT distinctions among the target consonants within the same boundary condition, and across prosodic conditions, larger prosodic domains exhibit longer VOT values.
In American English, there exists a tense/lax distinction among the front vowels [i]/[I] and [e]/[E]. These often create minimal pairs in a number of different environments. Before a velar nasal, however, the contrast is not preserved and is usually considered to have collapsed to only the set of lax vowels. Nevertheless, Ladefoged (2001) states that for the high front vowels “many younger Americans pronounce ‘sing’ with a vowel closer to that in ‘beat’ rather than to that in ‘bit.’” An acoustic study was conducted with nine young native Californians to examine whether raising occurs before the velar nasal, as has been anecdotally observed for the front high vowels, and if so, whether it also occurs for the front mid vowels, which have not previously been reported to raise from the lax. The results indicate that these subjects produce a vowel intermediate in formant frequencies (and sometimes, though not always, in duration) between the tense and lax vowel in the velar nasal environment. Further, this occurs for both the high and mid front vowels.
J. Acoust. Soc. Am. 116, 2630 (2004)
This paper examines the phonetic structure of focus in Turkish. Focus is marked in different ways in different languages. The three most common ways of marking focus are pitch perturbation, higher intensity, and lengthening (O'Shaughnessy 1979, O'Shaughnessy & Allen 1983, Cooper et al. 1985, Wells 1986). In many languages such as English, German, and Dutch, duration, amplitude, and pitch combine to give the effect of perceived stress in words that are in focus (Selkirk 1984, 1994; Gussenhoven 1983, 1994). Other languages, such as Danish and Spanish, use only pitch and amplitude as cues to focus, and, in these languages, durational effects play an unimportant role (Nooteboom & Kruyt 1987, Toledo 1989). Turkish is a scrambling language where the word order is flexible. While basic word order is SOV, almost all six permutations are possible in many sentences. In Turkish, broad focus is marked with word order as well as a pitch perturbation, typically an H tone. Generally, the accented word preceding the verb is in focus, unless the verb is in utterance-initial position, in which case the verb itself is in focus. In narrow focus, however, the focused word does not immediately precede the verb. An experiment is conducted to examine whether focus in Turkish is marked by durational lengthening in addition to pitch perturbation and word order. We investigate the hypothesis that duration is not important in broad focus because of the presence of the very salient pitch and word order cues. Further, we hypothesize that durational effects may play an important role in narrow focus, since it is otherwise not as saliently marked in that it lacks the canonical word order cue. Data is collected from four native Turkish speakers in a controlled task. Stimuli include (i) four sentences, with phonetically identical subjects and objects, in which word order specifies broad focus and (ii) a second parallel set of sentences with narrow focus (i.e. in which the focused noun is not in canonical pre-verbal position) for comparison. Seven repetitions of each sentence are recorded and digitized. Phonetic lengthening in broad and narrow focus will be measured from the waveforms and statistically evaluated to confirm or refute the experimental hypotheses.
In Spanish, /b, d, g/ are
usually spirantized to voiced approximants in all syllabic contexts
after a continuant sound. However, in North-Central Peninsular Spanish
(NCS), spirantization interacts with coda devoicing,
yielding voiceless fricatives. In the majority of cases, coda
/b, d, g/ occur in stressed syllables. This work examines whether
stress is a factor in the likelihood of frication and devoicing of coda
/b, d, g/ in this dialect. An acoustic study was conducted with nine
native speakers of NCS. These speakers were tested
on nonce words with /b, d, g/ in coda position in both stressed
and unstressed syllables. Measurements were made of vowel and
consonant duration, presence and absence of frication and voicing, and
voicing duration. The results show that frication is more likely in
stressed syllables than in unstressed syllables.
This suggests that in stressed syllables, a higher subglottal pressure
produces higher airflow across the glottis, which favors frication. In
turn, frication inhibits voicing due to conflicting aerodynamic
requirements between the two. We conclude that stress is a factor in
spirantization, and that it may indirectly affect
the voicing properties of /b, d, g/.
C. Gonzalez. (2002) Phonetic variation in voiced obstruents
in North-Central Peninsular Spanish. Journal of the
International Phonetic Association, 32(1). [pdf]
This project investigates the relation between the spatiotemporal orchestration of the vocal movements used to produce speech and a particular aspect of linguistic structure—word stress. The proposal describes a study employing a previously collected articulatory database of nearly a thousand spoken productions of words differing only in their stress pattern. This database was collected using a magnetometer system to track articulator movement during speech and includes data on tongue, lip and jaw trajectories for three subjects. The kinematic characteristics of these movements will be analyzed as a function of stress condition and phrasal prominence. The proposed study will profile the manner in which articulatory behavior is shaped by the linguistic dimension of word stress.
Formulations
of
flapping as a symbolic phonological rule suggest clear articulatory
differences between flaps and stops, and often offer no overt
explanation for why phrase boundaries should block the
alternation. The present study explores the articulatory
foundation of the distinction between flaps and non-flaps in word-final
position. We examine kinematic and acoustic data for these
articulations in phrase-final and -medial positions and in falling
and level stress contours. It is shown that a discrepancy exists
between acoustic and articulatory durational patterning: while acoustic
durations of flaps are shorter than those of non-flaps overall, their
articulatory durations are not uniformly so.
It is important to consider multiple potential articulatory sources,
both spatial and temporal, for the acoustic shortness that characterizes
flaps, including spatial reduction, temporal articulatory shortening,
and changes in intergestural coordination.
The kinematic data indicate that different sources of flap shortness
exist for different speakers and different prosodic conditions.
These results imply that flaps are not categorically different from
non-flaps in the articulatory domain, as traditional formulations of
flapping as a symbolic phonological rule would suggest.
We conclude that the gradient variability in the spatiotemporal
patterning of tongue tip constrictions yields acoustic shortening
in flaps.
T. Fukaya & D. Byrd (2005) An articulatory examination of word-final
flapping at phrase edges and interiors. Journal of the International
Phonetic Association 35(1). [pdf]
Recent research has determined that coarticulatory information in speech provides important cues to early word segmentation. This experiment investigates whether 7-month-old infants' ability to recognize a string requires the presence of appropriate coarticulatory information in the speech familiarization stream. Following familiarization to a string of CV syllables, infants were tested to determine if sequences that co-occurred in the familiarization string were preferred over those in which the syllables did not appear adjacently during familiarization. Further, the test phase was conducted so that the items had either appropriate or inappropriate coarticulation information. The results indicate that infants tested on items with appropriate coarticulation listened significantly longer to strings that had appeared during familiarization than to the appropriately coarticulated control strings that never occurred together during familiarization. Interestingly, when presented with inappropriate coarticulation test items, infants showed no preference for previously familiarized strings over the non-co-occurring syllable strings. We conclude that infants are sensitive to coarticulation in recognizing sequences in a speech stream. Furthermore, coarticulatory cues, in combination with other cues to segmentation, greatly enhance recognition of syllable sequences. These results suggest that coarticulation plays an important role in early word segmentation.
A widespread pattern, the
exclusion of similar sound elements in a word, is known
as a phonological cooccurrence restriction. When word formation
produces a structure containing sounds that violate a cooccurrence
restriction, the phonological form of the word is altered in
one of two ways in order to obey the condition: the prohibited sounds
either dissimilate (become less similar) or assimilate (become
identical). We are investigating a hypothesis that bans on similar but
different elements have a foundation in psycholinguistic
processing. By conducting speech error experiments, we will investigate
the factors contributing to production ease in words containing
consonants that differ in voicing/nasality. The results will
have important implications for shaping a crosslinguistic typology of
voicing/nasality cooccurrence effects. Explaining phonological
cooccurrence restrictions in terms of maximizing production and
processing ease is an exciting new interdisciplinary research direction
combining linguistics and psychology. This research promises not only
to bring new understanding to the study of widespread cooccurrence
patterns, but also to mark a significant
advance in determining what universal factors underlie properties
of human language.