Spontaneous speech events in two speech databases of human-computer and human-human dialogs in Spanish.

Abstract:

Previous works in English have revealed that disfluencies follow regular patterns and that incorporating them into the language model of a speech recognizer leads to lower perplexities and sometimes to a better performance. Although work on disfluency modeling has been applied outside the English community (e.g., in Japanese), as far as we know there is no specific work dealing with disfluencies in Spanish. In this paper, we follow a data driven approach in exploring the potential benefit of modeling disfluencies in a speech recognizer in Spanish. Two databases of human-computer and human-human dialogs are considered, which allow the absolute and relative frequencies of disfluencies in the two situations to be compared. The rate of disfluencies in human-human dialogs is found to be very close to that found for similar databases in English. Due to setup factors, the rate of disfluencies found in human-computer dialogs was remarkably higher than that reported for similar databases in English. In any case, from the point of view of speech recognition, the high frequencies of disfluencies and the distinct features of the acoustic events related to them support the need for explicit acoustic models. The regularities observed in the distribution of filled pauses and speech repairs reveal that including them in the language model of the speech recognizer may be also helpful. The extent to which the number of events depends on utterance length and on the speaker is also explored. Statistics are shown that follow previous studies for English, and a sizeable space is devoted to comparing our results with them. Finally, various possible cues for the automatic detection of speech repairs--a key issue from the point of view of speech understanding--are explored: silent pauses, filled pauses, lengthenings, cut off words and discourse markers. As previously observed for English, none of them was found to be reliable by itself. More information, especially at the acoustic-prosodic level, is no doubt needed to reliably detect speech repairs.

journal_name

Lang Speech

journal_title

Language and speech

authors

Rodríguez LJ,Inés Torres M

doi

10.1177/00238309060490030201

subject

Has Abstract

pub_date

2006-01-01 00:00:00

pages

333-66

issue

Pt 3

eissn

1756-6053

issn

0023-8309

journal_volume

49

pub_type

Journal Article
  • Base-language effects on word identification in bilingual speech: evidence from categorical perception experiments.

    abstract::The categorical perception paradigm was used to investigate whether French-English bilinguals categorize a code-switched word as French or English on the basis of its acoustic-phonetic information alone or whether they are influenced by the base-language context in which the word occurs, that is, by the language in wh...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383098903200404

    authors: Bürki-Cohen J,Grosjean F,Miller JL

    update_date:1989-10-01 00:00:00

  • Individual Differences in the Adoption of Sound Change.

    abstract::It is still unclear whether an individual's adoption of on-going sound change starts in production or in perception, and what the time course of the adoption of sound change is in adult speakers. These issues are investigated by means of a large-scale (106 participants) laboratory study of an on-going vowel shift in D...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830920959753

    authors: Voeten CC

    update_date:2020-10-25 00:00:00

  • Informativeness, Timing and Tempo in Lexical Self-Repair.

    abstract::This paper presents a study of the temporal organization of lexical repair in spontaneous Dutch speech. It assesses the extent to which offset-to-repair duration and repair tempo can be predicted on the basis of offset timing, reparandum tempo and measures of the informativeness of the crucial lexical items in the rep...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830915618427

    authors: Plug L

    update_date:2016-12-01 00:00:00

  • Event-related potentials reflecting the processing of phonological constraint violations.

    abstract::How are violations of phonological constraints processed in word comprehension? The present article reports the results of an event-related potentials (ERP) study on a phonological constraint of German that disallows identical segments within a syllable or word (CC(i)VC(i)). We examined three types of monosyllabic lat...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830909336581

    authors: Domahs U,Kehrein W,Knaus J,Wiesel R,Schlesewsky M

    update_date:2009-01-01 00:00:00

  • Sorry, Not Sorry: The independent role of multiple phonetic cues in signaling the difference between two word meanings.

    abstract::We examine the use of multiple subphonemic differences distinguishing homophones in production and perception, through a case study focusing on the distinction between two polysemous senses of the English word "sorry" (apology vs. attention-seeking). An analysis of production data from voice actors revealed significan...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830921988975

    authors: Martinuzzi C,Schertz J

    update_date:2021-01-28 00:00:00

  • What does more time buy you? Another look at the effects of long-term residence on production accuracy of English /ɹ/ and /l/ by Japanese speakers.

    abstract::This study tested the issue of whether extended length of residence (LOR) in adulthood can provide sufficient input to overcome age effects. The study replicates Flege, Takagi, and Mann (1995), which found that 10 out of 12 Japanese learners of English with extensive residence (12 years or more) produced liquids as ac...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/00238309060490040401

    authors: Larson-Hall J

    update_date:2006-01-01 00:00:00

  • The function of sentence accents and given/new information in speech processing: different strategies for normal-hearing and hearing-impaired listeners?

    abstract::Two experiments were carried out to investigate how the correspondence between sentence accentuation and distribution of information is used in human word processing. A forced-choice task with target words embedded in sentences was employed for this purpose. Target words provided either 'given' or 'new' information, a...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383099403700403

    authors: van Donselaar W,Lentz J

    update_date:1994-10-01 00:00:00

  • Effects of Valence on Hemispheric Specialization for Emotion Word Processing.

    abstract::The use of emotion in language is a key element of human interactions and a rich area for cognitive research. The present study examined reactions to words of five types: positive emotion (e.g., happiness), negative emotion (e.g., hatred), positive emotion-laden (e.g., blessing), negative emotion-laden (e.g., prison),...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830916686128

    authors: Martin JM,Altarriba J

    update_date:2017-12-01 00:00:00

  • Early sound patterns in the speech of two Brazilian Portuguese speakers.

    abstract::Sound patterns in the speech of two Brazilian-Portuguese speaking children are compared with early production patterns in English-learning children as well as English and Brazilian-Portuguese (BP) characteristics. The relationship between production system effects and ambient language influences in the acquisition of ...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/00238309020450020401

    authors: Teixeira ER,Davis BL

    update_date:2002-06-01 00:00:00

  • Crossmodal and incremental perception of audiovisual cues to emotional speech.

    abstract::In this article we report on two experiments about the perception of audiovisual cues to emotional speech. The article addresses two questions: (1) how do visual cues from a speaker's face to emotion relate to auditory cues, and (2) what is the recognition speed for various facial cues to emotion? Both experiments repo...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830909348993

    authors: Barkhuysen P,Krahmer E,Swerts M

    update_date:2010-01-01 00:00:00

  • Effects of lexical stress in auditory word recognition.

    abstract::Although research examining the use of prosodic information in the processing of spoken words has increased in recent years, results from these studies have been inconclusive. The present series of experiments systematically examines the importance of one prosodic variable (lexical stress) in the recognition of isolat...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383099003300104

    authors: Slowiaczek LM

    update_date:1990-01-01 00:00:00

  • The role of tonal onglides in German nuclear pitch accents.

    abstract::A perception experiment with native German listeners provided evidence for the relevance of the tonal onglide in nuclear accents--the pitch movement leading towards the target on the accented syllable. Listeners were able to distinguish between two pragmatic meanings of a short phrase (given/non-contrastive and new/co...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830914565688

    authors: Ritter S,Grice M

    update_date:2015-03-01 00:00:00

  • A comparative investigation of coarticulation in fricatives: electropalatographic, electromagnetic, and acoustic data.

    abstract::The principal aim of this investigation was to compare coarticulatory effects at different levels of the speech production system, in order to gain insight into the relations between the different levels. To this end, the relative magnitudes of carryover and anticipatory coarticulation with adjacent vowels were measur...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383099303600307

    authors: Hoole P,Nguyen-Trong N,Hardcastle W

    update_date:1993-04-01 00:00:00

  • Stress in ASL: empirical evidence and linguistic issues.

    abstract::The study of signed languages provides an opportunity to identify those characteristics of language that are universal and to investigate the effect of production modality (signed vs. spoken) on the grammar. Over time, American Sign Language (ASL) has accommodated itself to the production and perception requirements o...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/00238309990420020501

    authors: Wilbur RB

    update_date:1999-04-01 00:00:00

  • Identification of acoustically modified Mandarin tones by non-native listeners.

    abstract::This study investigated identification of fragmented Mandarin tones by non-native listeners. Monosyllabic Mandarin words were digitally processed to generate intact, silent-center, center-only, and onset-only syllables. The syllables were recorded with two carrier phrases such that the offset of the carrier tone and t...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830909357160

    authors: Lee CY,Tao L,Bond ZS

    update_date:2010-01-01 00:00:00

  • Comprehension of temporal terms by good and poor readers.

    abstract::Poor beginning readers often have difficulty comprehending spoken sentences with complex syntactic structures. This study attempts to identify the reasons for this difficulty. Second-grade good and poor readers were tested for comprehension of spoken sentences containing the temporal terms before and after. Processing...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383098903200103

    authors: Macaruso P,Bar-Shalom E,Crain S,Shankweiler D

    update_date:1989-01-01 00:00:00

  • Interspeaker variability in emphatic accent production in French.

    abstract::This research aims (1) to describe the acoustic manifestations of emphatic accent in French by examining similarities and differences between four speakers; and (2) to identify, amongst the acoustic measures, those which determine the perception of emphasis. In experiment 1, four speakers were asked to read twenty-fou...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383099603900402

    authors: Dahan D,Bernard JM

    update_date:1996-10-01 00:00:00

  • Dialect effects in vowel perception: the role of temporal information in French.

    abstract::The importance of vowel duration for specifying vowel contrasts differs across languages. In English, for example, a number of vowel pairs are acoustically differentiated by both temporal and spectral information, whereas in standard French temporal information plays a much more minor role. Gottfried and Beddor (1988)...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383099704000304

    authors: Miller JL,Grosjean F

    update_date:1997-07-01 00:00:00

  • Acoustic parameters in human speaker recognition.

    abstract::Four speaker identification tests were conducted using five female speakers known to the listeners. Starting from acoustic recordings of reiterant "ma" syllables, the perceptual importance of the following three factors was investigated: F0 height, F0 contour, and speech rhythm. For speakers with typically low or high...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/002383099003300302

    authors: van Dommelen WA

    update_date:1990-07-01 00:00:00

  • The perception of phonological quantity based on durational cues by native speakers, second-language users and nonspeakers of Finnish.

    abstract::Some languages, such as Finnish, use speech-sound duration as the primary cue for a phonological quantity distinction. For second-language (L2) learners, quantity is often difficult to master if speech-sound duration plays a less important role in the phonology of their native language (L1). By comparing the categoriz...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/00238309050480030401

    authors: Ylinen S,Shestakova A,Alku P,Huotilainen M

    update_date:2005-01-01 00:00:00

  • Combining techniques to reveal emergent effects in infants' segmentation, word learning, and grammar.

    abstract::This paper provides three representative examples that highlight the ways in which procedures can be combined to study interactions across traditional domains of study: segmentation, word learning, and grammar. The first section uses visual familiarization prior to the Headturn Preference Procedure to demonstrate that...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/00238309060490010201

    authors: Hollich G

    update_date:2006-01-01 00:00:00

  • Communicative Success in Spatial Dialogue: The Impact of Functional Features and Dialogue Strategies.

    abstract::This paper addresses the impact of dialogue strategies and functional features of spatial arrangements on communicative success. To examine the sharing of cognition between two minds in order to achieve a joint goal, we collected a corpus of 24 extended German-language dialogues in a referential communication task tha...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830916651097

    authors: Tenbrink T,Andonova E,Schole G,Coventry KR

    update_date:2017-06-01 00:00:00

  • Early Speech Segmentation in French-learning Infants: Monosyllabic Words versus Embedded Syllables.

    abstract::Lexical acquisition relies on many mechanisms, one of which corresponds to segmentation abilities, that is, the ability to extract word forms from fluent speech. This ability is important since words are rarely produced in isolation even when talking to infants. The present study explored whether young French-learning...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830914551375

    authors: Nishibayashi LL,Goyet L,Nazzi T

    update_date:2015-09-01 00:00:00

  • Effects of vowel duration and vowel quality on vowel-to-vowel coarticulation.

    abstract::This work investigates how vowel duration and vowel quality affect degrees of vowel-to-vowel coarticulation. The effects of these two factors on vowel-to-vowel coarticulation have previously received little study. Phonological durational differences due to vowel length distinction were examined in Thai. It was hypothe...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830911404961

    authors: Mok PP

    update_date:2011-12-01 00:00:00

  • Durational patterning at syntactic and discourse boundaries in Mandarin spontaneous speech.

    abstract::This study focused on durational cues (i.e., syllable duration, pause duration, and syllable onset intervals (SOIs)) at discourse boundaries in two dialects of Mandarin, Taiwan and Mainland varieties. Speech was elicited by having 18 participants describe events in The Pear Story film. Recorded data were transcribed, l...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830910372492

    authors: Fon J,Johnson K,Chen S

    update_date:2011-03-01 00:00:00

  • Linking Variation in Perception and Production in Sound Change: Evidence from Dutch Obstruent Devoicing.

    abstract::This study investigates the link between the perception and production in sound change in progress, both at the regional and the individual level. Two devoicing processes showing regional variation in Dutch are studied: the devoicing of initial labiodental fricatives and of initial bilabial stops. Five regions were se...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830919880206

    authors: Pinget AF,Kager R,Van de Velde H

    update_date:2020-09-01 00:00:00

  • Biomechanically preferred consonant-vowel combinations fail to appear in adult spoken corpora.

    abstract::Certain consonant/vowel (CV) combinations are more frequent than would be expected from the individual C and V frequencies alone, both in babbling and, to a lesser extent, in adult language, based on dictionary counts: Labial consonants co-occur with central vowels more often than chance would dictate; coronals co-occ...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830911434123

    authors: Whalen DH,Giulivi S,Nam H,Levitt AG,Hallé P,Goldstein LM

    update_date:2012-12-01 00:00:00

  • Interaction of native- and second-language vowel system(s) in early and late bilinguals.

    abstract::The objective of this study was to determine how bilinguals' age at the time of language acquisition influenced the organization of their phonetic system(s). The productions of six English and five Korean vowels by English and Korean monolinguals were compared to the productions of the same vowels by early and late Ko...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/00238309050480010101

    authors: Baker W,Trofimovich P

    update_date:2005-01-01 00:00:00

  • Syllable onset intervals as an indicator of discourse and syntactic boundaries in Taiwan Mandarin.

    abstract::This study looks at the syllable onset interval (SOI) patterning in Taiwan Mandarin spontaneous speech and its relationship to discourse and syntactic units. Monologs were elicited by asking readers to tell stories depicted in comic strips and were transcribed and segmented into Discourse Segment Units (Grosz & Sidner...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/00238309040470010301

    authors: Fon J,Johnson K

    update_date:2004-01-01 00:00:00

  • Polynomial modeling of child and adult intonation in German spontaneous speech.

    abstract::In a data set of 291 spontaneous utterances from German 5-year-olds, 7-year-olds and adults, nuclear pitch contours were labeled manually using the GToBI annotation system. Ten different contour types were identified. The fundamental frequency (F0) of these contours was modeled using third-order orthogonal polynomials, ...

    journal_title:Language and speech

    pub_type: Journal Article

    doi:10.1177/0023830910397495

    authors: de Ruiter LE

    update_date:2011-06-01 00:00:00