APSIPA Transactions on Signal and Information Processing, Vol. 7, Issue 1

Engagement recognition by a latent character model based on multimodal listener behaviors in spoken dialogue

Koji Inoue (Yoshida-honmachi, Japan; inoue@sap.ist.i.kyoto-u.ac.jp), Divesh Lala (Yoshida-honmachi, Japan), Katsuya Takanashi (Yoshida-honmachi, Japan), Tatsuya Kawahara (Yoshida-honmachi, Japan)
 
Suggested Citation
Koji Inoue, Divesh Lala, Katsuya Takanashi and Tatsuya Kawahara (2018), "Engagement recognition by a latent character model based on multimodal listener behaviors in spoken dialogue", APSIPA Transactions on Signal and Information Processing: Vol. 7: No. 1, e9. http://dx.doi.org/10.1017/ATSIP.2018.11

Publication Date: 12 Sep 2018
© 2018 Koji Inoue, Divesh Lala, Katsuya Takanashi and Tatsuya Kawahara
 
Keywords
Engagement, Multimodal, Listener behaviors, Latent variable model, Dialogue
 

Open Access

This is published under the terms of the Creative Commons Attribution licence.

In this article:
I. INTRODUCTION 
II. RELATED WORKS 
III. DIALOGUE DATA AND ANNOTATION OF ENGAGEMENT 
IV. LATENT CHARACTER MODEL 
V. ONLINE PROCESSING 
VI. EXPERIMENTAL EVALUATIONS 
VII. CONCLUSION 

Abstract

Engagement represents how much a user is interested in and willing to continue the current dialogue. Engagement recognition provides an important clue for dialogue systems to generate behaviors adapted to the user. This paper addresses engagement recognition based on four multimodal listener behaviors: backchannels, laughing, head nodding, and eye gaze. In the annotation of engagement, the ground-truth labels often differ from one annotator to another because the perception of engagement is subjective. To deal with this, we assume that each annotator has a latent character that affects his/her perception of engagement. We propose a hierarchical Bayesian model that estimates both engagement and the character of each annotator as latent variables. Furthermore, we integrate the engagement recognition model with automatic detection of the listener behaviors to realize online engagement recognition. Experimental results show that the proposed model improves recognition accuracy compared with methods that do not consider annotator character, such as majority voting. We also achieve online engagement recognition without degrading accuracy.
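
As an illustration of the majority-voting baseline mentioned in the abstract, a minimal sketch follows. It assumes binary engagement labels (1 = engaged, 0 = not engaged) given by several annotators for each listening segment; the function name `majority_vote`, the label encoding, and the tie-breaking rule are assumptions made for this sketch and are not taken from the paper. It shows only the baseline, which treats all annotators identically, and not the proposed latent character model.

```python
from collections import Counter


def majority_vote(labels, tie_value=0):
    """Merge one segment's annotator labels (0 or 1) into a single label.

    tie_value is a hypothetical tie-breaking choice for this sketch.
    """
    counts = Counter(labels)
    engaged, not_engaged = counts[1], counts[0]
    if engaged == not_engaged:
        return tie_value
    return 1 if engaged > not_engaged else 0


# Hypothetical annotations: one inner list per segment, one label per annotator.
annotations = [[1, 1, 0], [0, 0, 1], [1, 0]]
consensus = [majority_vote(segment) for segment in annotations]
```

Because every annotator's vote carries the same weight here, systematic differences in how individual annotators perceive engagement are discarded, which is the limitation the abstract's latent character assumption is meant to address.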

DOI:10.1017/ATSIP.2018.11