Decoding Attention from Gaze: A Benchmark Dataset and End-to-End Models

Citations: 0
Authors
Uppal, Karan [1 ]
Kim, Jaeah [2 ]
Singh, Shashank [3 ]
Affiliations
[1] Indian Inst Technol, Kharagpur, W Bengal, India
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Max Planck Inst Intelligent Syst, Tubingen, Germany
Source
GAZE MEETS MACHINE LEARNING WORKSHOP, 2022, Vol. 210
Funding
US National Science Foundation
Keywords
Gaze; Eye-Tracking; Deep Learning; Attentional Decoding; VISUAL WORLD PARADIGM; MOUNTED EYE-TRACKING;
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Eye-tracking has the potential to provide rich behavioral data about human cognition in ecologically valid environments. However, analyzing this rich data is often challenging. Most automated analyses are specific to simplistic artificial visual stimuli with well-separated, static regions of interest, while most analyses in the context of complex visual stimuli, such as most natural scenes, rely on laborious and time-consuming manual annotation. This paper studies using computer vision tools for "attention decoding", the task of assessing the locus of a participant's overt visual attention over time. We provide a publicly available Multiple Object Eye-Tracking (MOET) dataset, consisting of gaze data from participants tracking specific objects, annotated with labels and bounding boxes, in crowded real-world videos, for training and evaluating attention decoding algorithms. We also propose two end-to-end deep learning models for attention decoding and compare these to state-of-the-art heuristic methods.
Pages: 219-240
Page count: 22