Multistream Gaze Estimation with Anatomical Eye Region Isolation by Synthetic to Real Transfer Learning

Cited by: 1
|
Authors
Mahmud Z. [1 ,3 ]
Hungler P. [3 ]
Etemad A. [1 ,3 ]
Affiliations
[1] Queen's University, Kingston, Ontario
[2] Queen's University, Kingston, Ontario
Keywords
deep neural network; domain randomization; estimation; eye region segmentation; feature extraction; gaze estimation; head; iris; lighting; multistream network; synthetic data; training; transfer learning
DOI
10.1109/TAI.2024.3366174
Abstract
We propose a novel neural pipeline, MSGazeNet, that learns gaze representations by taking advantage of eye anatomy information through a multistream framework. Our proposed solution comprises two components: a network for isolating anatomical eye regions, and a second network for multistream gaze estimation. The eye region isolation is performed with a U-Net style network, which we train using a synthetic dataset containing eye region masks for the visible eyeball and the iris region. The synthetic dataset used in this stage is procured using the UnityEyes simulator and consists of 80,000 eye images. After training, the eye region isolation network is transferred to the real domain to generate masks for real-world eye images. To make this transfer successful, we exploit domain randomization in the training process, which allows the synthetic images to benefit from a larger variance with the help of augmentations that resemble artifacts. The generated eye region masks, along with the raw eye images, are then used together as a multistream input to our gaze estimation network, which consists of wide residual blocks. The output embeddings from these encoders are fused in the channel dimension before being fed into the gaze regression layers. We evaluate our framework on three gaze estimation datasets and achieve strong performance. Our method surpasses the state-of-the-art by 7.57% and 1.85% on two datasets, and obtains competitive results on the third. We also study the robustness of our method with respect to noise in the data and demonstrate that our model is less sensitive to noisy data. Lastly, we perform a variety of experiments, including ablation studies, to evaluate the contribution of different components and design choices in our solution.
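The channel-dimension fusion described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the number of streams (raw eye image, eyeball mask, iris mask), the embedding shapes, and the single-layer regression head are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs: one embedding per input stream
# (raw eye image, visible-eyeball mask, iris mask), each of shape
# (channels, height, width) after the wide residual encoders.
streams = [rng.standard_normal((64, 4, 6)) for _ in range(3)]

# Fuse the stream embeddings in the channel dimension:
# three (64, 4, 6) tensors -> one (192, 4, 6) tensor.
fused = np.concatenate(streams, axis=0)

# Stand-in gaze regression head: flatten the fused features and
# apply one linear map to a 2-D gaze output (e.g., pitch and yaw).
w = rng.standard_normal((2, fused.size)) * 0.01
b = np.zeros(2)
gaze = w @ fused.ravel() + b

print(fused.shape)  # (192, 4, 6)
print(gaze.shape)   # (2,)
```

Concatenating along the channel axis keeps each stream's spatial layout intact, so the regression layers can still weigh anatomically aligned features (e.g., the iris region) against the raw appearance at the same spatial location.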
Pages: 1-15
Page count: 14
Related Papers
50 items total
  • [41] Automatic Eye Type Detection in Retinal Fundus Image Using Fusion of Transfer Learning and Anatomical Features
    Roy, Pallab Kanti
    Chakravorty, Rajib
    Sedai, Suman
    Mahapatra, Dwarikanath
    Garnavi, Rahil
    2016 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2016, : 538 - 544
  • [42] Robust deep learning for eye fundus images: Bridging real and synthetic data for enhancing generalization
    Oliveira, Guilherme C.
    Rosa, Gustavo H.
    Pedronette, Daniel C.G.
    Papa, João P.
    Kumar, Himeesh
    Passos, Leandro A.
    Kumar, Dinesh
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 94
  • [44] Real-Time Estimation of Eye Movement Condition Using a Deep Learning Model
    Sugiura, Akihiro
    Itazu, Yoshiki
    Tanaka, Kunihiko
    Takada, Hiroki
    HCI INTERNATIONAL 2021 - LATE BREAKING PAPERS: MULTIMODALITY, EXTENDED REALITY, AND ARTIFICIAL INTELLIGENCE, 2021, 13095 : 132 - 143
  • [45] Estimation of behavioral user state based on eye gaze and head pose-application in an e-learning environment
    Asteriadis, Stylianos
    Tzouveli, Paraskevi
    Karpouzis, Kostas
    Kollias, Stefanos
    MULTIMEDIA TOOLS AND APPLICATIONS, 2009, 41 (03) : 469 - 493
  • [46] Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization
    Kim, Yoonsik
    Soh, Jae Woong
    Park, Gu Yong
    Cho, Nam Ik
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 3479 - 3489
  • [47] TRANSFER LEARNING FROM SYNTHETIC TO REAL IMAGES USING VARIATIONAL AUTOENCODERS FOR PRECISE POSITION DETECTION
    Inoue, Tadanobu
    Chaudhury, Subhajit
    De Magistris, Giovanni
    Dasgupta, Sakyasingha
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 2725 - 2729
  • [48] Driver Gaze Zone Estimation Based on Three-Channel Convolution-Optimized Vision Transformer With Transfer Learning
    Li, Zhao
    Jiang, Siyang
    Fu, Rui
    Guo, Yingshi
    Wang, Chang
    IEEE SENSORS JOURNAL, 2024, 24 (24) : 42064 - 42078
  • [49] Get a Grip: Slippage-Robust and Glint-Free Gaze Estimation for Real-Time Pervasive Head-Mounted Eye Tracking
    Santini, Thiago
    Niehorster, Diederick C.
    Kasneci, Enkelejda
    ETRA 2019: 2019 ACM SYMPOSIUM ON EYE TRACKING RESEARCH & APPLICATIONS, 2019,
  • [50] A robust, real-time camera-based eye gaze tracking system to analyze users' visual attention using deep learning
    Singh, Jaiteg
    Modi, Nandini
    INTERACTIVE LEARNING ENVIRONMENTS, 2024, 32 (02) : 409 - 430