Depressive Tendency Recognition by Fusing Speech and Text Features: A Comparative Analysis

Cited by: 0

Authors:

He, Yimin [1]

Lu, Xiaoyong [1]

Yuan, Jingyi [1]

Pan, Tao [1]

Wang, Yafan [1]

Affiliations:

[1] Northwest Normal University, Lanzhou, Gansu, People's Republic of China

Source:

2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2022

Funding:

U.S. National Science Foundation;

Keywords:

Depression recognition; Deep learning; GRU; Multimodal;

DOI:

10.1109/ISCSLP57327.2022.10038078

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Depression is accompanied by persistent low mood, loss of interest, excessive guilt, and other states. It seriously affects people's physical and mental health and causes harm to the individual, the family, and society. Depressed people tend to speak slowly and in a monotone, with longer pauses. Their emotional expression is also often accompanied by many negative words. Therefore, speech and text information are combined using a gated recurrent unit (GRU) network for depression prediction, so that the fusion method analyzes the data from the signal level to the language level. It is more comprehensive than a model that uses only speech features or only text features. In addition, a comparative analysis was carried out across different speech types, three kinds of emotional-stimulus corpora, and gender. The multimodal system is more effective than a single modality, and speech type, emotional stimulus, and gender each have some effect on depression recognition.


Pages: 344-348

Page count: 5