Decoding Attention from Gaze: A Benchmark Dataset and End-to-End Models

Citations: 0

Authors:

Uppal, Karan [1]

Kim, Jaeah [2]

Singh, Shashank [3]

Affiliations:

[1] Indian Inst Technol, Kharagpur, W Bengal, India

[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[3] Max Planck Inst Intelligent Syst, Tubingen, Germany

Source:

GAZE MEETS MACHINE LEARNING WORKSHOP, VOL 210 | 2022 / Vol. 210

Funding:

National Science Foundation (USA);

Keywords:

Gaze; Eye-Tracking; Deep Learning; Attentional Decoding; VISUAL WORLD PARADIGM; MOUNTED EYE-TRACKING;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Eye-tracking has potential to provide rich behavioral data about human cognition in ecologically valid environments. However, analyzing this rich data is often challenging. Most automated analyses are specific to simplistic artificial visual stimuli with well-separated, static regions of interest, while most analyses in the context of complex visual stimuli, such as most natural scenes, rely on laborious and time-consuming manual annotation. This paper studies using computer vision tools for "attention decoding", the task of assessing the locus of a participant's overt visual attention over time. We provide a publicly available Multiple Object Eye-Tracking (MOET) dataset, consisting of gaze data from participants tracking specific objects, annotated with labels and bounding boxes, in crowded real-world videos, for training and evaluating attention decoding algorithms. We also propose two end-to-end deep learning models for attention decoding and compare these to state-of-the-art heuristic methods.
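To make the attention decoding task concrete, here is a minimal Python sketch of one simple heuristic: for each video frame, pick the labeled object whose bounding box lies closest to the gaze point. This is an illustration only. It is not one of the paper's end-to-end models, and it is not claimed to be one of the heuristic methods the paper compares against. The names `Box`, `distance_to_box`, `decode_attention`, and the `max_distance` threshold are hypothetical, as is the assumed data layout (one gaze point and one label-to-box mapping per frame, in pixel coordinates).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Box:
    """Axis-aligned bounding box in pixel coordinates (assumed layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def distance_to_box(gx: float, gy: float, box: Box) -> float:
    """Euclidean distance from a gaze point to a box; 0 if the point is inside."""
    dx = max(box.x_min - gx, 0.0, gx - box.x_max)
    dy = max(box.y_min - gy, 0.0, gy - box.y_max)
    return (dx * dx + dy * dy) ** 0.5


def decode_attention(
    gaze: List[Tuple[float, float]],
    boxes_per_frame: List[Dict[str, Box]],
    max_distance: float = 50.0,  # hypothetical threshold, in pixels
) -> List[Optional[str]]:
    """Return, per frame, the label of the nearest box, or None if none is close."""
    decoded: List[Optional[str]] = []
    for (gx, gy), boxes in zip(gaze, boxes_per_frame):
        best_label: Optional[str] = None
        best_dist = float("inf")
        for label, box in boxes.items():
            d = distance_to_box(gx, gy, box)
            # Strict comparison: on ties (e.g. overlapping boxes that both
            # contain the gaze point), the first label in the mapping wins.
            if d < best_dist:
                best_label, best_dist = label, d
        decoded.append(best_label if best_dist <= max_distance else None)
    return decoded


if __name__ == "__main__":
    frame_boxes = {"car": Box(0, 0, 20, 20), "dog": Box(80, 80, 90, 90)}
    gaze_points = [(10.0, 10.0), (100.0, 100.0), (500.0, 500.0)]
    print(decode_attention(gaze_points, [frame_boxes] * 3))
```

A frame-by-frame rule like this ignores temporal context and becomes ambiguous when boxes overlap or crowd together, which is the kind of setting the abstract describes for crowded real-world videos.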


Pages: 219-240

Page count: 22
