Abstract:
Previous work on English has revealed that disfluencies follow regular patterns and that incorporating them into the language model of a speech recognizer leads to lower perplexities and sometimes to better performance. Although work on disfluency modeling has been applied outside the English community (e.g., in Japanese), as far as we know there is no specific work dealing with disfluencies in Spanish. In this paper, we follow a data-driven approach in exploring the potential benefit of modeling disfluencies in a speech recognizer in Spanish. Two databases of human-computer and human-human dialogs are considered, which allow the absolute and relative frequencies of disfluencies in the two situations to be compared. The rate of disfluencies in human-human dialogs is found to be very close to that found for similar databases in English. Due to setup factors, the rate of disfluencies found in human-computer dialogs was remarkably higher than that reported for similar databases in English. In any case, from the point of view of speech recognition, the high frequencies of disfluencies and the distinct features of the acoustic events related to them support the need for explicit acoustic models. The regularities observed in the distribution of filled pauses and speech repairs reveal that including them in the language model of the speech recognizer may also be helpful. The extent to which the number of events depends on utterance length and on the speaker is also explored. Statistics are shown that follow previous studies for English, and a sizeable space is devoted to comparing our results with them. Finally, various possible cues for the automatic detection of speech repairs--a key issue from the point of view of speech understanding--are explored: silent pauses, filled pauses, lengthenings, cut-off words and discourse markers. As previously observed for English, none of them was found to be reliable by itself. More information, especially at the acoustic-prosodic level, is no doubt needed to reliably detect speech repairs.
journal_name: Lang Speech
journal_title: Language and speech
authors: Rodríguez LJ, Inés Torres M
doi: 10.1177/00238309060490030201
subject: Has Abstract
pub_date: 2006-01-01 00:00:00
pages: 333-66
issue: Pt 3
eissn: 0023-8309
issn: 1756-6053
journal_volume: 49
pub_type: journal article
abstract::The categorical perception paradigm was used to investigate whether French-English bilinguals categorize a code-switched word as French or English on the basis of its acoustic-phonetic information alone or whether they are influenced by the base-language context in which the word occurs, that is, by the language in wh...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383098903200404
update_date:1989-10-01 00:00:00
abstract::It is still unclear whether an individual's adoption of on-going sound change starts in production or in perception, and what the time course of the adoption of sound change is in adult speakers. These issues are investigated by means of a large-scale (106 participants) laboratory study of an on-going vowel shift in D...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830920959753
update_date:2020-10-25 00:00:00
abstract::This paper presents a study of the temporal organization of lexical repair in spontaneous Dutch speech. It assesses the extent to which offset-to-repair duration and repair tempo can be predicted on the basis of offset timing, reparandum tempo and measures of the informativeness of the crucial lexical items in the rep...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830915618427
update_date:2016-12-01 00:00:00
abstract::How are violations of phonological constraints processed in word comprehension? The present article reports the results of an event-related potentials (ERP) study on a phonological constraint of German that disallows identical segments within a syllable or word (CC(i)VC(i)). We examined three types of monosyllabic lat...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830909336581
update_date:2009-01-01 00:00:00
abstract::We examine the use of multiple subphonemic differences distinguishing homophones in production and perception, through a case study focusing on the distinction between two polysemous senses of the English word "sorry" (apology vs. attention-seeking). An analysis of production data from voice actors revealed significan...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830921988975
update_date:2021-01-28 00:00:00
abstract::This study tested the issue of whether extended length of residence (LOR) in adulthood can provide sufficient input to overcome age effects. The study replicates Flege, Takagi, and Mann (1995), which found that 10 out of 12 Japanese learners of English with extensive residence (12 years or more) produced liquids as ac...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/00238309060490040401
update_date:2006-01-01 00:00:00
abstract::Two experiments were carried out to investigate how the correspondence between sentence accentuation and distribution of information is used in human word processing. A forced-choice task with target words embedded in sentences was employed for this purpose. Target words provided either 'given' or 'new' information, a...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383099403700403
update_date:1994-10-01 00:00:00
abstract::The use of emotion in language is a key element of human interactions and a rich area for cognitive research. The present study examined reactions to words of five types: positive emotion (e.g., happiness), negative emotion (e.g., hatred), positive emotion-laden (e.g., blessing), negative emotion-laden (e.g., prison),...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830916686128
update_date:2017-12-01 00:00:00
abstract::Sound patterns in the speech of two Brazilian-Portuguese speaking children are compared with early production patterns in English-learning children as well as English and Brazilian-Portuguese (BP) characteristics. The relationship between production system effects and ambient language influences in the acquisition of ...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/00238309020450020401
update_date:2002-06-01 00:00:00
abstract::In this article we report on two experiments about the perception of audiovisual cues to emotional speech. The article addresses two questions: (1) how do visual cues from a speaker's face to emotion relate to auditory cues, and (2) what is the recognition speed for various facial cues to emotion? Both experiments repo...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830909348993
update_date:2010-01-01 00:00:00
abstract::Although research examining the use of prosodic information in the processing of spoken words has increased in recent years, results from these studies have been inconclusive. The present series of experiments systematically examines the importance of one prosodic variable (lexical stress) in the recognition of isolat...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383099003300104
update_date:1990-01-01 00:00:00
abstract::A perception experiment with native German listeners provided evidence for the relevance of the tonal onglide in nuclear accents--the pitch movement leading towards the target on the accented syllable. Listeners were able to distinguish between two pragmatic meanings of a short phrase (given/non-contrastive and new/co...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830914565688
update_date:2015-03-01 00:00:00
abstract::The principal aim of this investigation was to compare coarticulatory effects at different levels of the speech production system, in order to gain insight into the relations between the different levels. To this end, the relative magnitudes of carryover and anticipatory coarticulation with adjacent vowels were measur...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383099303600307
update_date:1993-04-01 00:00:00
abstract::The study of signed languages provides an opportunity to identify those characteristics of language that are universal and to investigate the effect of production modality (signed vs. spoken) on the grammar. Over time, American Sign Language (ASL) has accommodated itself to the production and perception requirements o...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/00238309990420020501
update_date:1999-04-01 00:00:00
abstract::This study investigated identification of fragmented Mandarin tones by non-native listeners. Monosyllabic Mandarin words were digitally processed to generate intact, silent-center, center-only, and onset-only syllables. The syllables were recorded with two carrier phrases such that the offset of the carrier tone and t...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830909357160
update_date:2010-01-01 00:00:00
abstract::Poor beginning readers often have difficulty comprehending spoken sentences with complex syntactic structures. This study attempts to identify the reasons for this difficulty. Second-grade good and poor readers were tested for comprehension of spoken sentences containing the temporal terms before and after. Processing...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383098903200103
update_date:1989-01-01 00:00:00
abstract::This research aims (1) to describe the acoustic manifestations of emphatic accent in French by examining similarities and differences between four speakers; and (2) to identify, amongst the acoustic measures, those which determine the perception of emphasis. In experiment 1, four speakers were asked to read twenty-fou...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383099603900402
update_date:1996-10-01 00:00:00
abstract::The importance of vowel duration for specifying vowel contrasts differs across languages. In English, for example, a number of vowel pairs are acoustically differentiated by both temporal and spectral information, whereas in standard French temporal information plays a much more minor role. Gottfried and Beddor (1988)...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383099704000304
update_date:1997-07-01 00:00:00
abstract::Four speaker identification tests were conducted using five female speakers known to the listeners. Starting from acoustic recordings of reiterant "ma" syllables, the perceptual importance of the following three factors was investigated: F0 height, F0 contour, and speech rhythm. For speakers with typically low or high...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/002383099003300302
update_date:1990-07-01 00:00:00
abstract::Some languages, such as Finnish, use speech-sound duration as the primary cue for a phonological quantity distinction. For second-language (L2) learners, quantity is often difficult to master if speech-sound duration plays a less important role in the phonology of their native language (L1). By comparing the categoriz...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/00238309050480030401
update_date:2005-01-01 00:00:00
abstract::This paper provides three representative examples that highlight the ways in which procedures can be combined to study interactions across traditional domains of study: segmentation, word learning, and grammar. The first section uses visual familiarization prior to the Headturn Preference Procedure to demonstrate that...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/00238309060490010201
update_date:2006-01-01 00:00:00
abstract::This paper addresses the impact of dialogue strategies and functional features of spatial arrangements on communicative success. To examine the sharing of cognition between two minds in order to achieve a joint goal, we collected a corpus of 24 extended German-language dialogues in a referential communication task tha...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830916651097
update_date:2017-06-01 00:00:00
abstract::Lexical acquisition relies on many mechanisms, one of which corresponds to segmentation abilities, that is, the ability to extract word forms from fluent speech. This ability is important since words are rarely produced in isolation even when talking to infants. The present study explored whether young French-learning...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830914551375
update_date:2015-09-01 00:00:00
abstract::This work investigates how vowel duration and vowel quality affect degrees of vowel-to-vowel coarticulation. The effects of these two factors on vowel-to-vowel coarticulation have previously received little study. Phonological durational differences due to vowel length distinction were examined in Thai. It was hypothe...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830911404961
update_date:2011-12-01 00:00:00
abstract::This study focused on durational cues (i.e., syllable duration, pause duration, and syllable onset intervals (SOIs)) at discourse boundaries in two dialects of Mandarin, Taiwan and Mainland varieties. Speech was elicited by having 18 participants describe events in The Pear Story film. Recorded data were transcribed, l...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830910372492
update_date:2011-03-01 00:00:00
abstract::This study investigates the link between the perception and production in sound change in progress, both at the regional and the individual level. Two devoicing processes showing regional variation in Dutch are studied: the devoicing of initial labiodental fricatives and of initial bilabial stops. Five regions were se...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830919880206
update_date:2020-09-01 00:00:00
abstract::Certain consonant/vowel (CV) combinations are more frequent than would be expected from the individual C and V frequencies alone, both in babbling and, to a lesser extent, in adult language, based on dictionary counts: Labial consonants co-occur with central vowels more often than chance would dictate; coronals co-occ...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830911434123
update_date:2012-12-01 00:00:00
abstract::The objective of this study was to determine how bilinguals' age at the time of language acquisition influenced the organization of their phonetic system(s). The productions of six English and five Korean vowels by English and Korean monolinguals were compared to the productions of the same vowels by early and late Ko...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/00238309050480010101
update_date:2005-01-01 00:00:00
abstract::This study looks at the syllable onset interval (SOI) patterning in Taiwan Mandarin spontaneous speech and its relationship to discourse and syntactic units. Monologs were elicited by asking readers to tell stories depicted in comic strips and were transcribed and segmented into Discourse Segment Units (Grosz & Sidner...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/00238309040470010301
update_date:2004-01-01 00:00:00
abstract::In a data set of 291 spontaneous utterances from German 5-year-olds, 7-year-olds and adults, nuclear pitch contours were labeled manually using the GToBI annotation system. Ten different contour types were identified. The fundamental frequency (F0) of these contours was modeled using third-order orthogonal polynomials, ...
journal_title:Language and speech
pub_type: journal article
doi:10.1177/0023830910397495
update_date:2011-06-01 00:00:00