
Multi-modal sensing and analysis of poster conversations with smart posterboard

Tatsuya Kawahara, Kyoto University, Japan (kawahara@i.kyoto-u.ac.jp); Takuma Iwatate, Kyoto University, Japan; Koji Inoue, Kyoto University, Japan; Soichiro Hayashi, Kyoto University, Japan; Hiromasa Yoshimoto, Kyoto University, Japan; Katsuya Takanashi, Kyoto University, Japan
 
Suggested Citation
Tatsuya Kawahara, Takuma Iwatate, Koji Inoue, Soichiro Hayashi, Hiromasa Yoshimoto and Katsuya Takanashi (2016), "Multi-modal sensing and analysis of poster conversations with smart posterboard", APSIPA Transactions on Signal and Information Processing: Vol. 5: No. 1, e2. http://dx.doi.org/10.1017/ATSIP.2016.2

Publication Date: 02 Mar 2016
© 2016 Tatsuya Kawahara, Takuma Iwatate, Koji Inoue, Soichiro Hayashi, Hiromasa Yoshimoto and Katsuya Takanashi
 
Keywords
Multi-modal signal processing, Conversation analysis, Behavioral analysis, Speaker diarization
 

Open Access

This article is published under the terms of the Creative Commons Attribution licence.

In this article:
I. INTRODUCTION 
II. MULTI-MODAL CORPUS OF POSTER CONVERSATIONS 
III. MULTI-MODAL SENSING WITH SMART POSTERBOARD 
IV. PREDICTION OF TURN-TAKING FROM MULTI-MODAL BEHAVIORS 
V. SPEAKER DIARIZATION USING EYE-GAZE INFORMATION 
VI. PREDICTION OF INTEREST AND COMPREHENSION LEVEL VIA AUDIENCE'S QUESTIONS FROM MULTI-MODAL BEHAVIORS 
VII. CONCLUSIONS 

Abstract

Conversations in poster sessions at academic events, referred to as poster conversations, pose interesting and challenging topics in multi-modal signal and information processing. We have developed a smart posterboard for multi-modal recording and analysis of poster conversations. The smart posterboard is equipped with multiple sensing devices to record poster conversations, so that we can review who came to the poster and what kind of questions or comments they made. The conversation analysis incorporates face and eye-gaze tracking for effective speaker diarization. It is demonstrated that eye-gaze information is useful for predicting turn-taking and also for improving speaker diarization. Moreover, high-level indexing of the interest and comprehension level of the audience is explored based on multi-modal behaviors during the conversation. This is realized by predicting the audience's speech acts, such as questions and reactive tokens.
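
To make the idea of gaze-assisted diarization concrete, the following minimal Python sketch fuses per-frame audio activity scores with a gaze-derived prior on who is speaking. The function name, the 0/1 gaze encoding, and the log-linear fusion weight are illustrative assumptions for this sketch, not the formulation developed in the paper (Section V).

import numpy as np

def fuse_audio_gaze(audio_scores, gaze_at_speaker, gaze_weight=0.3):
    """Estimate the active speaker per frame from audio and gaze cues.

    audio_scores:    (frames, speakers) audio-based activity scores in [0, 1].
    gaze_at_speaker: (frames, speakers) 1 if the presenter's gaze falls on
                     that participant in the frame, else 0.
    gaze_weight:     interpolation weight for the gaze prior (assumed value).
    """
    eps = 1e-9
    log_audio = np.log(audio_scores + eps)
    # Soft gaze prior: a gazed-at participant is more likely to speak,
    # but a non-gazed participant is never ruled out entirely.
    log_gaze = np.log(0.25 + 0.5 * gaze_at_speaker)
    fused = (1.0 - gaze_weight) * log_audio + gaze_weight * log_gaze
    return fused.argmax(axis=1)

# Toy example: 3 frames, 2 audience members; the gaze prior resolves
# the acoustically ambiguous middle frame in favor of speaker 0.
audio = np.array([[0.55, 0.50], [0.40, 0.45], [0.30, 0.70]])
gaze = np.array([[0, 1], [1, 0], [0, 1]])
print(fuse_audio_gaze(audio, gaze))  # -> [1 0 1]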

DOI: 10.1017/ATSIP.2016.2