Detecting Deception from Gaze and Speech Using a Multimodal Attention LSTM-Based Framework

Cited by: 14
Authors
Gallardo-Antolín, Ascensión [1]
Montero, Juan M. [2 ]
Affiliations
[1] Univ Carlos III Madrid, Dept Signal Theory & Commun, Avda Univ 30, Madrid 28911, Spain
[2] Univ Politecn Madrid, ETSIT, Speech Technol Grp, Avda Complutense 30, Madrid 28040, Spain
Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Issue 14
Keywords
deception detection; multimodal; gaze; speech; LSTM; attention; fusion
DOI
10.3390/app11146393
CLC Number
O6 [Chemistry]
Subject Classification Code
0703
Abstract
The automatic detection of deceptive behaviors has recently attracted the attention of the research community due to the variety of areas where it can play a crucial role, such as security or criminology. This work is focused on the development of an automatic deception detection system based on gaze and speech features. The first contribution of our research on this topic is the use of attention Long Short-Term Memory (LSTM) networks for single-modal systems with frame-level features as input. In the second contribution, we propose a multimodal system that combines the gaze and speech modalities into the LSTM architecture using two different combination strategies: Late Fusion and Attention-Pooling Fusion. The proposed models are evaluated over the Bag-of-Lies dataset, a multimodal database recorded in real conditions. On the one hand, results show that attentional LSTM networks are able to adequately model the gaze and speech feature sequences, outperforming a reference Support Vector Machine (SVM)-based system with compact features. On the other hand, both combination strategies produce better results than the single-modal systems and the multimodal reference system, suggesting that gaze and speech modalities carry complementary information for the task of deception detection that can be effectively exploited by using LSTMs.
Pages: 16
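
Since this record carries only the abstract, the following is a minimal, hypothetical PyTorch sketch of the kind of architecture it describes: one attention LSTM encoder per modality (frame-level gaze and speech features), with the two attention-pooled embeddings concatenated before a binary truth/deception decision layer. All names, layer sizes, feature dimensionalities, and the exact fusion point are illustrative assumptions, not the authors' implementation; the paper itself compares Late Fusion and Attention-Pooling Fusion variants whose details are given in the full text.

    # Hypothetical sketch (NOT the authors' code): attention LSTM encoders
    # for gaze and speech, fused by concatenating their pooled embeddings.
    import torch
    import torch.nn as nn

    class AttentionLSTMEncoder(nn.Module):
        """Encodes a frame-level feature sequence into one vector:
        an LSTM followed by softmax attention pooling over time."""

        def __init__(self, input_dim: int, hidden_dim: int = 64):
            super().__init__()
            self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
            self.attn = nn.Linear(hidden_dim, 1)  # one score per frame

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, frames, input_dim)
            h, _ = self.lstm(x)                     # (batch, frames, hidden)
            w = torch.softmax(self.attn(h), dim=1)  # (batch, frames, 1)
            return (w * h).sum(dim=1)               # (batch, hidden)

    class MultimodalDeceptionClassifier(nn.Module):
        """Binary truth/deception classifier over the concatenated
        attention-pooled gaze and speech embeddings (illustrative
        fusion; the paper's two fusion strategies may differ)."""

        def __init__(self, gaze_dim: int, speech_dim: int, hidden_dim: int = 64):
            super().__init__()
            self.gaze_enc = AttentionLSTMEncoder(gaze_dim, hidden_dim)
            self.speech_enc = AttentionLSTMEncoder(speech_dim, hidden_dim)
            self.classifier = nn.Linear(2 * hidden_dim, 2)

        def forward(self, gaze: torch.Tensor, speech: torch.Tensor) -> torch.Tensor:
            z = torch.cat([self.gaze_enc(gaze), self.speech_enc(speech)], dim=-1)
            return self.classifier(z)               # logits: (batch, 2)

    # Toy usage with made-up dimensions: 8 clips, 100 gaze frames x 4 features,
    # 300 speech frames x 40 features (e.g., log-mel filterbanks).
    model = MultimodalDeceptionClassifier(gaze_dim=4, speech_dim=40)
    logits = model(torch.randn(8, 100, 4), torch.randn(8, 300, 40))
    print(logits.shape)  # torch.Size([8, 2])

The design choice sketched here, pooling each modality independently before fusion, sidesteps the mismatch between gaze and speech frame rates, since each attention layer normalizes over its own sequence length.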