Crossmodal and incremental perception of audiovisual cues to emotional speech.

Abstract:

In this article we report on two experiments about the perception of audiovisual cues to emotional speech. The article addresses two questions: (1) how do visual cues to emotion from a speaker's face relate to auditory cues, and (2) what is the recognition speed for various facial cues to emotion? Both experiments reported below are based on tests with video clips of emotional utterances collected via a variant of the well-known Velten method. More specifically, we recorded speakers who displayed positive or negative emotions, which were congruent or incongruent with the (emotional) lexical content of the uttered sentence. In order to test this, we conducted two experiments. The first experiment is a perception experiment in which Czech participants, who do not speak Dutch, rate the perceived emotional state of Dutch speakers in a bimodal (audiovisual) or a unimodal (audio- or vision-only) condition. It was found that incongruent emotional speech leads to significantly more extreme perceived emotion scores than congruent emotional speech, where the difference between congruent and incongruent emotional speech is larger for the negative than for the positive conditions. Interestingly, the largest overall differences between congruent and incongruent emotions were found for the audio-only condition, which suggests that posing an incongruent emotion has a particularly strong effect on the spoken realization of emotions. The second experiment uses a gating paradigm to test the recognition speed for various emotional expressions from a speaker's face. In this experiment participants were presented with the same clips as in Experiment I, but this time presented vision-only. The clips were shown in successive segments (gates) of increasing duration. Results show that participants are surprisingly accurate in their recognition of the various emotions, as they already reach high recognition scores in the first gate (after only 160 ms). Interestingly, the recognition scores rise faster for positive than for negative conditions. Finally, the gating results suggest that incongruent emotions are perceived as more intense than congruent emotions, as the former get more extreme recognition scores than the latter, already after a short period of exposure.

journal_name

Lang Speech

journal_title

Language and speech

authors

Barkhuysen P,Krahmer E,Swerts M

doi

10.1177/0023830909348993

subject

Has Abstract

pub_date

2010-01-01 00:00:00

pages

3-30

issue

Pt 1

eissn

1756-6053

issn

0023-8309

journal_volume

53

pub_type

Journal article
  • A commercial large-vocabulary discrete speech recognition system: DragonDictate.

    abstract::DragonDictate is currently the only commercially available general-purpose, large-vocabulary speech recognition system. It uses discrete speech and is speaker-dependent, adapting to the speaker's voice and language model with every word. Its acoustic adaptability is based in a three-level phonology and a stochastic mo...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/002383099203500218

    authors: Mandel MA

    update_date:1992-01-01 00:00:00

  • Early sound patterns in the speech of two Brazilian Portuguese speakers.

    abstract::Sound patterns in the speech of two Brazilian-Portuguese speaking children are compared with early production patterns in English-learning children as well as English and Brazilian-Portuguese (BP) characteristics. The relationship between production system effects and ambient language influences in the acquisition of ...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309020450020401

    authors: Teixeira ER,Davis BL

    update_date:2002-06-01 00:00:00

  • Effects of lexical stress in auditory word recognition.

    abstract::Although research examining the use of prosodic information in the processing of spoken words has increased in recent years, results from these studies have been inconclusive. The present series of experiments systematically examines the importance of one prosodic variable (lexical stress) in the recognition of isolat...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/002383099003300104

    authors: Slowiaczek LM

    update_date:1990-01-01 00:00:00

  • Auditory word serial recall benefits from orthographic dissimilarity.

    abstract::The influence of orthographic knowledge has been consistently observed in speech recognition and metaphonological tasks. The present study provides data suggesting that such influence also pervades other cognitive domains related to language abilities, such as verbal working memory. Using se...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830910371450

    authors: Pattamadilok C,Lafontaine H,Morais J,Kolinsky R

    update_date:2010-01-01 00:00:00

  • Interspeaker variability in emphatic accent production in French.

    abstract::This research aims (1) to describe the acoustic manifestations of emphatic accent in French by examining similarities and differences between four speakers; and (2) to identify, amongst the acoustic measures, those which determine the perception of emphasis. In experiment 1, four speakers were asked to read twenty-fou...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/002383099603900402

    authors: Dahan D,Bernard JM

    update_date:1996-10-01 00:00:00

  • Auditory and visual cueing of the [+/- rounded] feature of vowels.

    abstract::That lipreading plays a role in phoneme recognition, even when the acoustic signal alone is phonologically unambiguous, has been concluded from experiments in the perception of discrepant combinations of acoustic and visual speech signals. Little is known about the effect of visual information on explicitly phonetic j...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/002383099203500402

    authors: Lisker L,Rossi M

    update_date:1992-10-01 00:00:00

  • Syntactic structure guides prosody in temporarily ambiguous sentences.

    abstract::A pair of speaking and listening studies investigated the prosody of sentences with temporary Object/Clause and Late/Early Closure ambiguities. Speakers reliably produced prosodic cues that allowed listeners to disambiguate Late/Early Closure sentences, but only infrequently produced prosody that disambiguated Object/...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830910372497

    authors: Anderson C,Carlson K

    update_date:2010-01-01 00:00:00

  • The role of tonal onglides in German nuclear pitch accents.

    abstract::A perception experiment with native German listeners provided evidence for the relevance of the tonal onglide in nuclear accents--the pitch movement leading towards the target on the accented syllable. Listeners were able to distinguish between two pragmatic meanings of a short phrase (given/non-contrastive and new/co...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830914565688

    authors: Ritter S,Grice M

    update_date:2015-03-01 00:00:00

  • Word-minimality, epenthesis and coda licensing in the early acquisition of English.

    abstract::Many languages exhibit constraints on prosodic words, where lexical items must be composed of at least two moras of structure, or a binary foot. Demuth and Fee (1995) proposed that children demonstrate early sensitivity to word-minimality effects, exhibiting a period of vowel lengthening or vowel epenthesis if coda co...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309060490020201

    authors: Demuth K,Culbertson J,Alter J

    update_date:2006-01-01 00:00:00

  • Event-related potentials reflecting the processing of phonological constraint violations.

    abstract::How are violations of phonological constraints processed in word comprehension? The present article reports the results of an event-related potentials (ERP) study on a phonological constraint of German that disallows identical segments within a syllable or word (CC(i)VC(i)). We examined three types of monosyllabic lat...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830909336581

    authors: Domahs U,Kehrein W,Knaus J,Wiesel R,Schlesewsky M

    update_date:2009-01-01 00:00:00

  • Detection of target phonemes in spontaneous and read speech.

    abstract::Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, w...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/002383098803100203

    authors: Mehta G,Cutler A

    update_date:1988-04-01 00:00:00

  • Does vowel inventory density affect vowel-to-vowel coarticulation?

    abstract::This study tests the output constraints hypothesis that languages with a crowded phonemic vowel space would allow less vowel-to-vowel coarticulation than languages with a sparser vowel space to avoid perceptual confusion. Mandarin has fewer vowel phonemes than Cantonese, but their allophonic vowel spaces are similarly...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830912443948

    authors: Mok PP

    update_date:2013-06-01 00:00:00

  • Syllable onset intervals as an indicator of discourse and syntactic boundaries in Taiwan Mandarin.

    abstract::This study looks at the syllable onset interval (SOI) patterning in Taiwan Mandarin spontaneous speech and its relationship to discourse and syntactic units. Monologs were elicited by asking readers to tell stories depicted in comic strips and were transcribed and segmented into Discourse Segment Units (Grosz & Sidner...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309040470010301

    authors: Fon J,Johnson K

    update_date:2004-01-01 00:00:00

  • The perception of phonological quantity based on durational cues by native speakers, second-language users and nonspeakers of Finnish.

    abstract::Some languages, such as Finnish, use speech-sound duration as the primary cue for a phonological quantity distinction. For second-language (L2) learners, quantity is often difficult to master if speech-sound duration plays a less important role in the phonology of their native language (L1). By comparing the categoriz...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309050480030401

    authors: Ylinen S,Shestakova A,Alku P,Huotilainen M

    update_date:2005-01-01 00:00:00

  • Stress in ASL: empirical evidence and linguistic issues.

    abstract::The study of signed languages provides an opportunity to identify those characteristics of language that are universal and to investigate the effect of production modality (signed vs. spoken) on the grammar. Over time, American Sign Language (ASL) has accommodated itself to the production and perception requirements o...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309990420020501

    authors: Wilbur RB

    update_date:1999-04-01 00:00:00

  • Coordination and interpretation of vocal and visible resources: 'trail-off' conjunctions.

    abstract::The empirical focus of this paper is a conversational turn-taking phenomenon in which conjunctions produced immediately after a point of possible syntactic and pragmatic completion are treated by co-participants as points of possible completion and transition relevance. The data for this study are audio-video recordin...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830911428858

    authors: Walker G

    update_date:2012-03-01 00:00:00

  • The relationship between the perception and production of coarticulation during a sound change in progress.

    abstract::The present study is concerned with lax /u/-fronting in Standard British English and in particular with whether this sound change in progress can be attributed to a waning of the perceptual compensation for the coarticulatory effects of context. Younger and older speakers produced various monosyllables in which /u/ oc...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830911422194

    authors: Kleber F,Harrington J,Reubold U

    update_date:2012-09-01 00:00:00

  • The Prosody of Rhetorical and Information-Seeking Questions in German.

    abstract::This paper reports on the prosody of rhetorical questions (RQs) and information-seeking questions (ISQs) in German for two question types-polar questions and constituent questions (henceforth "wh-questions"). The results are as follows: Phonologically, polar RQs were mainly realized with H-% (high plateau), while pola...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830918816351

    authors: Braun B,Dehé N,Neitsch J,Wochner D,Zahner K

    update_date:2019-12-01 00:00:00

  • Spontaneous speech events in two speech databases of human-computer and human-human dialogs in Spanish.

    abstract::Previous works in English have revealed that disfluencies follow regular patterns and that incorporating them into the language model of a speech recognizer leads to lower perplexities and sometimes to a better performance. Although work on disfluency modeling has been applied outside the English community (e.g., in J...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309060490030201

    authors: Rodríguez LJ,Inés Torres M

    update_date:2006-01-01 00:00:00

  • Finding Phrases: The Interplay of Word Frequency, Phrasal Prosody and Co-speech Visual Information in Chunking Speech by Monolingual and Bilingual Adults.

    abstract::The audiovisual speech signal contains multimodal information to phrase boundaries. In three artificial language learning studies with 12 groups of adult participants we investigated whether English monolinguals and bilingual speakers of English and a language with opposite basic word order (i.e., in which objects pre...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830919842353

    authors: de la Cruz-Pavía I,Werker JF,Vatikiotis-Bateson E,Gervain J

    update_date:2020-06-01 00:00:00

  • An analysis of turn-taking and backchannels based on prosodic and syntactic features in Japanese map task dialogs.

    abstract::In this study, we investigate syntactic and prosodic features of the speaker's speech at points where turn-taking and backchannels occur, on the basis of our analysis of Japanese spontaneous dialogs. Specifically, we focus on features such as part of speech, duration, F0 contour pattern, relative height of the peak F0...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/002383099804100404

    authors: Koiso H,Horiuchi Y,Tutiya S,Ichikawa A,Den Y

    update_date:1998-07-01 00:00:00

  • Effects of vowel duration and vowel quality on vowel-to-vowel coarticulation.

    abstract::This work investigates how vowel duration and vowel quality affect degrees of vowel-to-vowel coarticulation. The effects of these two factors on vowel-to-vowel coarticulation have previously received little study. Phonological durational differences due to vowel length distinction were examined in Thai. It was hypothe...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830911404961

    authors: Mok PP

    update_date:2011-12-01 00:00:00

  • Native and Non-native Perception of Stress in Mapudungun: Assessing Structural Maintenance in the Phonology of an Endangered Language.

    abstract::Today, virtually all speakers of Mapudungun (formerly Araucanian), an endangered language of Chile and Argentina, are bilingual in Spanish. As a result, the firmness of native speaker intuitions-especially regarding perceptually complex issues such as word-stress-has been called into question. Even though native intui...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830916628899

    authors: Molineaux BJ

    update_date:2017-03-01 00:00:00

  • Linking Variation in Perception and Production in Sound Change: Evidence from Dutch Obstruent Devoicing.

    abstract::This study investigates the link between the perception and production in sound change in progress, both at the regional and the individual level. Two devoicing processes showing regional variation in Dutch are studied: the devoicing of initial labiodental fricatives and of initial bilabial stops. Five regions were se...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830919880206

    authors: Pinget AF,Kager R,Van de Velde H

    update_date:2020-09-01 00:00:00

  • The function of sentence accents and given/new information in speech processing: different strategies for normal-hearing and hearing-impaired listeners?

    abstract::Two experiments were carried out to investigate how the correspondence between sentence accentuation and distribution of information is used in human word processing. A forced-choice task with target words embedded in sentences was employed for this purpose. Target words provided either 'given' or 'new' information, a...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/002383099403700403

    authors: van Donselaar W,Lentz J

    update_date:1994-10-01 00:00:00

  • Sorry, Not Sorry: The independent role of multiple phonetic cues in signaling the difference between two word meanings.

    abstract::We examine the use of multiple subphonemic differences distinguishing homophones in production and perception, through a case study focusing on the distinction between two polysemous senses of the English word "sorry" (apology vs. attention-seeking). An analysis of production data from voice actors revealed significan...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830921988975

    authors: Martinuzzi C,Schertz J

    update_date:2021-01-28 00:00:00

  • What does more time buy you? Another look at the effects of long-term residence on production accuracy of English /ɹ/ and /l/ by Japanese speakers.

    abstract::This study tested the issue of whether extended length of residence (LOR) in adulthood can provide sufficient input to overcome age effects. The study replicates Flege, Takagi, and Mann (1995), which found that 10 out of 12 Japanese learners of English with extensive residence (12 years or more) produced liquids as ac...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309060490040401

    authors: Larson-Hall J

    update_date:2006-01-01 00:00:00

  • Interaction of native- and second-language vowel system(s) in early and late bilinguals.

    abstract::The objective of this study was to determine how bilinguals' age at the time of language acquisition influenced the organization of their phonetic system(s). The productions of six English and five Korean vowels by English and Korean monolinguals were compared to the productions of the same vowels by early and late Ko...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/00238309050480010101

    authors: Baker W,Trofimovich P

    update_date:2005-01-01 00:00:00

  • Identification of acoustically modified Mandarin tones by non-native listeners.

    abstract::This study investigated identification of fragmented Mandarin tones by non-native listeners. Monosyllabic Mandarin words were digitally processed to generate intact, silent-center, center-only, and onset-only syllables. The syllables were recorded with two carrier phrases such that the offset of the carrier tone and t...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830909357160

    authors: Lee CY,Tao L,Bond ZS

    update_date:2010-01-01 00:00:00

  • Communicative Success in Spatial Dialogue: The Impact of Functional Features and Dialogue Strategies.

    abstract::This paper addresses the impact of dialogue strategies and functional features of spatial arrangements on communicative success. To examine the sharing of cognition between two minds in order to achieve a joint goal, we collected a corpus of 24 extended German-language dialogues in a referential communication task tha...

    journal_title:Language and speech

    pub_type: Journal article

    doi:10.1177/0023830916651097

    authors: Tenbrink T,Andonova E,Schole G,Coventry KR

    update_date:2017-06-01 00:00:00