Depressive Tendency Recognition by Fusing Speech and Text Features: A Comparative Analysis

Cited by: 0
Authors
He, Yimin [1 ]
Lu, Xiaoyong [1 ]
Yuan, Jingyi [1 ]
Pan, Tao [1 ]
Wang, Yafan [1 ]
Affiliations
[1] Northwest Normal University, Lanzhou, Gansu, People's Republic of China
Source
2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2022
Funding
National Science Foundation (US);
Keywords
Depression recognition; Deep learning; GRU; Multimodal;
DOI
10.1109/ISCSLP57327.2022.10038078
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Depression is accompanied by persistent low mood, loss of interest, excessive guilt, and other states that seriously affect people's physical and mental health and cause harm to individuals, families, and society. Depressed people tend to speak slowly and in a monotone, with longer pauses, and their emotional expression is often accompanied by many negative words. Therefore, this work combines speech and text information using a gated recurrent unit (GRU) network for depression prediction; the fusion method analyzes the data from the signal level through the language level and is more comprehensive than models that use only a single speech or text feature. A comparative analysis across different speech types, three kinds of emotional stimulus corpora, and gender was also carried out. The multimodal system is more effective than a single modality, and speech type, emotional stimulus, and gender each have a certain effect on depression recognition.
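As a rough illustration of the fusion idea described in the abstract, the sketch below (PyTorch) encodes each modality with its own GRU and concatenates the final hidden states for a binary depression classifier. The feature dimensions, layer sizes, and late fusion by concatenation are assumptions made for illustration, not the architecture reported in the paper.

```python
# Minimal sketch of a GRU-based speech/text fusion classifier (PyTorch).
# All dimensions and the concatenation-based fusion are illustrative assumptions.
import torch
import torch.nn as nn

class SpeechTextFusionGRU(nn.Module):
    def __init__(self, speech_dim=40, text_dim=300, hidden_dim=128, num_classes=2):
        super().__init__()
        # One GRU per modality: frame-level acoustic features and word embeddings.
        self.speech_gru = nn.GRU(speech_dim, hidden_dim, batch_first=True)
        self.text_gru = nn.GRU(text_dim, hidden_dim, batch_first=True)
        # Late fusion: concatenate the final hidden states of both modalities.
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, speech_seq, text_seq):
        # speech_seq: (batch, frames, speech_dim); text_seq: (batch, words, text_dim)
        _, h_speech = self.speech_gru(speech_seq)   # h: (1, batch, hidden_dim)
        _, h_text = self.text_gru(text_seq)
        fused = torch.cat([h_speech[-1], h_text[-1]], dim=-1)
        return self.classifier(fused)               # logits over {non-depressed, depressed}

if __name__ == "__main__":
    model = SpeechTextFusionGRU()
    speech = torch.randn(4, 200, 40)   # e.g. 200 frames of 40-dim filterbank features
    text = torch.randn(4, 30, 300)     # e.g. 30 words of 300-dim embeddings
    print(model(speech, text).shape)   # torch.Size([4, 2])
```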
Pages: 344-348
Number of pages: 5
Related Papers
50 in total
  • [21] Sentiment Analysis by Fusing Text and Location Features of Geo-Tagged Tweets
    Lim, Wei Lun
    Ho, Chiung Ching
    Ting, Choo-Yee
    IEEE ACCESS, 2020, 8 : 181014 - 181027
  • [22] Recognition of Major Depressive Disorder Based on Facial Behavior and Speech Fusion Features
    Li J.
    Chen Q.
    Ding Z.
    Liu Z.
Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (01): 32 - 37
  • [23] Speech emotion recognition based on the reconstruction of acoustic and text features in latent space
    Santoso, Jennifer
    Sekiguchi, Rintaro
    Yamada, Takeshi
    Ishizuka, Kenkichi
    Hashimoto, Taiichi
    Makino, Shoji
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1678 - 1683
  • [24] Text-independent speech emotion recognition using frequency adaptive features
    Wu, Chenjian
    Huang, Chengwei
    Chen, Hong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (18) : 24353 - 24363
  • [25] Multi-modal Emotion Recognition using Speech Features and Text Embedding
    Kim J.-H.
    Lee S.-P.
Transactions of the Korean Institute of Electrical Engineers, 2021, 70 (01): 108 - 113
  • [26] Text-independent speech emotion recognition using frequency adaptive features
    Chenjian Wu
    Chengwei Huang
    Hong Chen
    Multimedia Tools and Applications, 2018, 77 : 24353 - 24363
  • [27] Fusing Facial Texture Features for Face Recognition
    Shao, Yanqing
    Tang, Chaowei
    Xiao, Min
    Tang, Hui
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES INDIA SECTION A-PHYSICAL SCIENCES, 2016, 86 (03) : 395 - 403
  • [28] Fusing DCT and LBP features for face recognition
    Li, Jian-Ke
    Zhao, Bao-Jun
    Zhang, Hui
    Jiao, Ji-Chao
Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2010, 30 (11): 1355 - 1359
  • [29] Face recognition fusing global and local features
    Yu, Wei-wei
    Teng, Xiao-long
    Liu, Chong-qing
    JOURNAL OF ELECTRONIC IMAGING, 2006, 15 (01)
  • [30] Fusing Facial Texture Features for Face Recognition
    Yanqing Shao
    Chaowei Tang
    Min Xiao
    Hui Tang
    Proceedings of the National Academy of Sciences, India Section A: Physical Sciences, 2016, 86 : 395 - 403