Toward human-centric deep video understanding

Industrial Technology Advances

Wenjun Zeng, Microsoft Research Asia, China, wezeng@microsoft.com
 
Suggested Citation
Wenjun Zeng (2020), "Toward human-centric deep video understanding", APSIPA Transactions on Signal and Information Processing: Vol. 9: No. 1, e1. http://dx.doi.org/10.1017/atsip.2019.26

Publication Date: 13 Jan 2020
© 2020 Wenjun Zeng
 
Keywords
Human-centric; Video understanding; Deep learning
 

Open Access

This article is published under the terms of the Creative Commons Attribution licence.

In this article:
I. INTRODUCTION
II. HUMAN-CENTRIC: WHY AND HOW?
III. HUMAN-CENTRIC VISION TASKS
IV. FUTURE PERSPECTIVES

Abstract

People are at the very heart of our daily work and life. As we strive to leverage artificial intelligence to empower every person on the planet to achieve more, we need to understand people far better than we can today. Human–computer interaction plays a significant role in human–machine hybrid intelligence, and human understanding becomes a critical step in addressing the tremendous challenges of video understanding. In this paper, we share our views on why and how a human-centric approach can be used to address challenging video understanding problems. We discuss human-centric vision tasks and their status, highlighting the challenges and how our understanding of human brain functions can be leveraged to effectively address some of them. We show that semantic models, view-invariant models, and spatial-temporal visual attention mechanisms are important building blocks. We also discuss future perspectives on video understanding.

DOI:10.1017/atsip.2019.26