Residual-Network-Based Supervised Gaze Prediction for First-Person Videos

Cited by: 1

Authors:

Li, Yujie [1]

Ding, Shuxue [2]

Li, Xiang [3]

Tan, Benying [1,3]

Kanemura, Atsunori [1,4,5]

Affiliations:

[1] Natl Inst Adv Ind Sci & Technol, Tsukuba, Ibaraki 3058560, Japan

[2] Guilin Univ Elect Technol, Sch Artificial Intelligence, Guilin 541004, Peoples R China

[3] Univ Aizu, Sch Comp Sci & Engn, Aizu Wakamatsu, Fukushima 9650005, Japan

[4] LeapMind Inc, Tokyo 1500044, Japan

[5] Adv Telecommun Res Inst Int, Kyoto 6190288, Japan

Source:

IEEE Access | 2019 / Vol. 7

Funding:

Japan Society for the Promotion of Science

Keywords:

Gaze prediction; first-person vision (FPV); saliency detection; convolution neural network (CNN); residual network; SALIENCY;

DOI:

10.1109/ACCESS.2019.2913791

Chinese Library Classification code:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Gaze prediction is an important problem in efficiently processing and understanding the large volume of incoming visual signals from first-person views (i.e., egocentric vision). Because many visual processes are expensive and human beings do not process the whole visual field, knowing the gaze position is an efficient way to understand the salient content of a video and what users pay attention to. However, current methods for gaze prediction are bottom-up and cannot incorporate information about user actions. We propose a supervised gaze prediction framework based on a residual network that takes user actions into consideration. Our model uses features extracted from the VGG-16 deep neural network to predict the gaze position in first-person vision (FPV) videos. Deep residual networks are combined with this model to learn the residual maps. The method aims to predict gaze positions with high accuracy. Experimental results show that its performance is competitive with that of state-of-the-art approaches.

Pages: 56208-56216

Page count: 9
