Research Topics
Information Status:
Various studies have demonstrated that verbal prosody, in languages such as Dutch and English, can be used to distinguish important pieces of information from less important ones. This is achieved through the distribution of pitch accents, which in these languages can mark information that is ‘new’ or ‘contrastive’, whereas ‘given’ information, i.e., information already known from the context, tends to be deaccented. However, these findings are known not to generalize to other languages, such as Italian, which strongly resists deaccentuation, or Japanese, where accents are lexically determined. In the visual domain, preliminary work suggests that rapid eyebrow movements can play an accentuation role as well. A connection between eyebrow and pitch movements is suggested by the fact that the two tend to co-occur, and speakers find it difficult to place them at different locations in a sentence. Yet a talking head whose eyebrow movements covary completely with pitch looks unnatural, so it remains an open issue which factors determine the occurrence of eyebrow movements. Our first observations suggest that eyebrow movements play a less significant role than pitch accents in signalling focus and are only really informative when pitch does not provide a clear cue, although visual cues do affect the perceived degree of prominence of a pitch accent. First explorations of real human speakers also suggest that speakers use visual cues in the mouth area and head nods to signal prominence.
For this research topic the following publications and MA project reports are available for download. Also, check out the demos for this topic.
Feedback:
Everyday conversations abound with different forms of feedback that signal whether or not a communication problem has occurred in a prior exchange. Broadly speaking, one can distinguish between positive and negative cues, which signal whether or not the dialogue is running smoothly. Past studies have shown that these two forms of backchanneling differ prosodically, essentially in that the negative forms have more ‘marked’ verbal prosodic settings (higher, louder, longer, slower). Our first analyses of human-machine interactions show that users of a spoken dialogue system produce marked facial expressions to cue problematic dialogue events. Conversely, combinations of smiles, eye closures, nods and eyebrow movements on the one hand, and melodic and temporal cues on the other, can provide convincing feedback from a synthetic talking head in human-machine interaction, indicating whether it accepts a user’s utterance or signals that it has problems understanding it.
For this research topic the following publications and MA project reports are available for download. Also, check out the demos for this topic.
Emotion and attitude:
Since a normal conversation is essentially a social activity that often takes place in informal settings, aspects of the interpersonal relations between conversants are reflected in their linguistic behaviour. For instance, speakers adapt their speaking style dynamically in the course of an interaction as a function of their emotional state (How certain is the speaker? Is the speaker happy or sad? Etc.). Speakers care about how they are perceived by their addressees, for instance in that they have audiovisual strategies to save face. Interestingly, adult speakers appear to behave quite differently in this respect from 8-year-old children. Previous studies have shown that different degrees of ‘arousal’ are reflected in the prosodic features of a speaker’s utterance: pitch range variation in particular (i.e., whether speakers are talking in a relatively high or low voice) appears to be a significant correlate. Seminal work by Ekman and Friesen shows that different facial expressions can be highly informative about a person’s internal emotional state.
For this research topic the following publications and MA project reports are available for download. Also, check out the demos for this topic.
Turn taking:
Conversations are characterized by a turn-taking mechanism that tends to proceed remarkably smoothly, with minimal delay between consecutive speaking turns and only rare cases of real overlap between two speakers. Past studies have shown that particular combinations of lexical, syntactic and prosodic information can serve as cues signalling that a speaker wants to keep the floor or to end the turn. In addition to cues at the edges of turns, our own work has also examined more global cues to turn-taking and shown that speakers can use pitch range to presignal the end of a turn, so that it may become clear some time before a possible transition point whether or not a speaker wants to finish a speaking turn. Earlier sociolinguistic work on face-to-face interactions suggests that there are additional visual cues to turn-taking (in particular eye contact and head movements): roughly speaking, a speaker breaks mutual gaze while speaking and returns gaze to the addressee upon turn completion. It is unknown, however, how these visual cues relate to auditory and lexico-syntactic cues.
For this research topic the following publications and MA project reports are available for download.