Decoding Attention from Gaze: A Benchmark Dataset and End-to-End Models

Authors
Uppal, Karan [1]
Kim, Jaeah [2]
Singh, Shashank [3]
Affiliations
[1] Indian Inst Technol, Kharagpur, W Bengal, India
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Max Planck Inst Intelligent Syst, Tubingen, Germany
Source
GAZE MEETS MACHINE LEARNING WORKSHOP, VOL 210 | 2022 / Volume 210
Funding
U.S. National Science Foundation
Keywords
Gaze; Eye-Tracking; Deep Learning; Attentional Decoding; VISUAL WORLD PARADIGM; MOUNTED EYE-TRACKING;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Eye-tracking has the potential to provide rich behavioral data about human cognition in ecologically valid environments. However, analyzing this rich data is often challenging. Most automated analyses are specific to simplistic artificial visual stimuli with well-separated, static regions of interest, while most analyses of complex visual stimuli, such as natural scenes, rely on laborious and time-consuming manual annotation. This paper studies the use of computer vision tools for "attention decoding", the task of assessing the locus of a participant's overt visual attention over time. We provide a publicly available Multiple Object Eye-Tracking (MOET) dataset, consisting of gaze data from participants tracking specific objects, annotated with labels and bounding boxes, in crowded real-world videos, for training and evaluating attention decoding algorithms. We also propose two end-to-end deep learning models for attention decoding and compare these to state-of-the-art heuristic methods.
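To make the attention decoding task concrete, the following is a minimal sketch of a simple distance-based heuristic: each gaze sample is assigned to the annotated object whose bounding box is closest. It is not the method proposed in the paper, nor any of the specific heuristic baselines it compares against. The function names, the (x_min, y_min, x_max, y_max) box layout, and the per-frame dictionary of labeled boxes are all assumptions made for illustration and do not reflect the actual MOET file format.

```python
from typing import Dict, List, Optional, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def decode_frame(gaze: Tuple[float, float], boxes: Dict[str, Box]) -> Optional[str]:
    """Return the label of the box nearest to the gaze point (None if no boxes)."""
    gx, gy = gaze
    best_label, best_dist = None, float("inf")
    for label, (x0, y0, x1, y1) in boxes.items():
        # Distance from the gaze point to the box; zero when the point is inside.
        dx = max(x0 - gx, 0.0, gx - x1)
        dy = max(y0 - gy, 0.0, gy - y1)
        dist = (dx * dx + dy * dy) ** 0.5
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def decode_sequence(
    gaze_points: List[Tuple[float, float]],
    boxes_per_frame: List[Dict[str, Box]],
) -> List[Optional[str]]:
    """Decode the attended object independently for each video frame."""
    return [decode_frame(g, b) for g, b in zip(gaze_points, boxes_per_frame)]
```

A per-frame rule like this ignores temporal context and gaze noise, which is the kind of limitation that motivates learned, end-to-end decoders.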
Pages: 219-240
Page count: 22