A human fatigue detection method based on speech spectrogram features

Cited by: 0
Authors
Li X. [1 ]
Li G. [2 ]
Deng M. [1 ]
Wan P. [1 ]
Yan L. [1 ]
Affiliations
[1] School of Transportation and Logistics, East China Jiaotong University, Nanchang
[2] School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing
Keywords
Fusion decision; Gray level co-occurrence matrix; Human fatigue detection; Spectrogram; Speech
DOI
10.19650/j.cnki.cjsi.J2007210
Abstract
To effectively apply visual image analysis of the speech spectrogram to human fatigue detection, a human fatigue detection method based on speech spectrogram features is proposed. First, the mechanism by which human fatigue influences the speech spectrogram is analyzed, and a Mel-frequency stretching transform of the spectrogram, grounded in auditory perception theory, is used to highlight the regions of interest that are susceptible to fatigue. Second, the Mel-frequency-stretched spectrogram is divided into 24 overlapping critical-band sub-images, and 15 texture features are extracted from the gray-level co-occurrence matrices of each sub-image in 4 directions to quantitatively describe the fatigue information. Finally, a human fatigue detection model based on the fusion of fatigue information from multiple sub-bands is built: a feature-level classifier is designed to detect the features of each critical band separately, and the final fatigue detection result is obtained by decision-level fusion of the multiple classifiers. Experimental results show that the extracted spectrogram features have stronger fatigue classification ability than traditional acoustic features, and that the fatigue detection performance of the method is better than that of existing spectrogram feature recognition methods. © 2021, Science Press. All rights reserved.
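A minimal sketch of the feature-extraction pipeline described in the abstract, assuming Python with librosa and scikit-image. The band layout, gray-level quantization, and the reduced set of GLCM statistics are illustrative assumptions, not the authors' exact configuration: the paper uses 24 overlapping critical bands and 15 texture features, while scikit-image exposes only six GLCM statistics directly.

```python
import numpy as np
import librosa
from skimage.feature import graycomatrix, graycoprops

def band_texture_features(wav_path, n_bands=24, levels=32):
    """Return GLCM texture features per spectrogram sub-band (hypothetical helper)."""
    y, sr = librosa.load(wav_path, sr=16000)
    # A Mel-scaled spectrogram stands in for the paper's Mel-frequency stretching step.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_bands * 4)
    img = librosa.power_to_db(mel, ref=np.max)
    # Quantize to a small number of gray levels for the co-occurrence matrices.
    img = np.round((img - img.min()) / (img.max() - img.min() + 1e-12) * (levels - 1)).astype(np.uint8)

    rows_per_band = img.shape[0] // n_bands
    features = []
    for b in range(n_bands):
        # Non-overlapping band slices here; the paper uses overlapping critical bands.
        sub = img[b * rows_per_band:(b + 1) * rows_per_band, :]
        # Co-occurrence matrices in 4 directions (0, 45, 90, 135 degrees) at distance 1.
        glcm = graycomatrix(sub, distances=[1],
                            angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                            levels=levels, symmetric=True, normed=True)
        # Six GLCM statistics per direction (a subset of the paper's 15 features).
        props = ["contrast", "dissimilarity", "homogeneity", "energy", "correlation", "ASM"]
        features.append(np.concatenate([graycoprops(glcm, p).ravel() for p in props]))
    return np.stack(features)  # shape: (n_bands, 4 directions x 6 statistics)
```

Per the abstract, each band's feature vector would then feed a feature-level classifier, with the per-band decisions combined by decision-level fusion to produce the final fatigue verdict.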
Pages: 123-132
Page count: 9