Decoding Attention from Gaze: A Benchmark Dataset and End-to-End Models

Citations: 0
Authors
Uppal, Karan [1 ]
Kim, Jaeah [2 ]
Singh, Shashank [3 ]
Affiliations
[1] Indian Inst Technol, Kharagpur, W Bengal, India
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Max Planck Inst Intelligent Syst, Tubingen, Germany
Source
GAZE MEETS MACHINE LEARNING WORKSHOP, 2022, Vol. 210
Funding
US National Science Foundation
Keywords
Gaze; Eye-Tracking; Deep Learning; Attentional Decoding; VISUAL WORLD PARADIGM; MOUNTED EYE-TRACKING;
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Eye-tracking has the potential to provide rich behavioral data about human cognition in ecologically valid environments. However, analyzing this rich data is often challenging. Most automated analyses are specific to simplistic artificial visual stimuli with well-separated, static regions of interest, while most analyses in the context of complex visual stimuli, such as most natural scenes, rely on laborious and time-consuming manual annotation. This paper studies using computer vision tools for "attention decoding", the task of assessing the locus of a participant's overt visual attention over time. We provide a publicly available Multiple Object Eye-Tracking (MOET) dataset, consisting of gaze data from participants tracking specific objects, annotated with labels and bounding boxes, in crowded real-world videos, for training and evaluating attention decoding algorithms. We also propose two end-to-end deep learning models for attention decoding and compare these to state-of-the-art heuristic methods.
Pages: 219-240
Page count: 22