Driver Behavior State Recognition based on Silence Removal Speech

Cited: 0
Authors:
Kamaruddin, Norhaslinda [1 ]
Rahman, Abdul Wahab Abdul [2 ]
Halim, Khairul Ikhwan Mohamad [1 ]
Noh, Muhammad Hafiq Iqmal Mohd [1 ]
Affiliations:
[1] Univ Teknol MARA Melaka, Kampus Jasin, Merlimau 77300, Melaka, Malaysia
[2] Int Islamic Univ Malaysia, Kulliyyah Informat & Commun Technol, Kuala Lumpur, Malaysia
Source:
2016 International Conference on Informatics and Computing (ICIC), 2016
Keywords:
driver behavior state; silence removal; Zero Crossing Rate; Short Term Energy; Mel Frequency Cepstral Coefficient; Multi Layer Perceptron;
DOI:
Not available
CLC Number:
TP301 [Theory and Methods];
Discipline Code:
081202;
Abstract:
Numerous studies have linked driver behavior to the causes of accidents, and some have concentrated on different input modalities to provide practical preventive measures. Speech, in particular, has been found to be a suitable input source for understanding and analyzing a driver's behavior state, because the emotional information underlying the driver's speech changes in measurable ways. However, the massive amount of driving speech data may hinder optimal processing and analysis performance due to computational complexity and time constraints. This paper presents a silence removal approach using Short Term Energy (STE) and Zero Crossing Rate (ZCR) prior to extracting the relevant features, in order to reduce computational time in a vehicular environment. The Mel Frequency Cepstral Coefficient (MFCC) feature extraction method coupled with a Multi Layer Perceptron (MLP) classifier is then employed to measure driver behavior state recognition performance. Experimental results demonstrate that the proposed approach obtains comparable performance, with accuracy ranging between 58.7% and 76.6%, in differentiating four driver behavior states: talking on a cell phone, outburst laughing, sleepy, and normal driving. It is envisaged that such an engine can be extended into a more comprehensive driver behavior identification system that may act as an embedded warning system for sleepy drivers.
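As a rough illustration of the front end the abstract describes, the Python sketch below frames the signal, keeps only frames whose Short Term Energy is high and whose Zero Crossing Rate is low (a common voiced-speech heuristic), and extracts MFCCs from the retained samples. The file name, frame parameters, and thresholds are illustrative assumptions, not values taken from the paper.

# Minimal sketch of STE/ZCR silence removal followed by MFCC extraction.
# Frame length, hop size, thresholds, and the file name are assumptions.
import numpy as np
import librosa

def remove_silence(signal, frame_len=400, hop=160,
                   ste_thresh=0.01, zcr_thresh=0.3):
    """Keep frames with high energy and low zero crossing rate."""
    kept = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        ste = np.mean(frame ** 2)                            # Short Term Energy
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2   # Zero Crossing Rate
        if ste > ste_thresh and zcr < zcr_thresh:
            kept.append(frame)
    return np.concatenate(kept) if kept else np.array([])

y, sr = librosa.load("driver_speech.wav", sr=16000)      # hypothetical recording
y = y / (np.max(np.abs(y)) + 1e-9)                       # normalize before thresholding
speech = remove_silence(y)
mfcc = librosa.feature.mfcc(y=speech, sr=sr, n_mfcc=13)  # features for the MLP

Statistics of these MFCC frames would then feed the MLP classifier to separate the four behavior states; the paper's actual frame parameters, thresholds, and network topology are not reproduced here.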
Pages: 186-191
Page count: 6
Related Papers
50 records in total
  • [31] Frequency and Duration Characteristics of Speech and Silence Behavior During Interviews
    Matarazzo, J. D.
    Hess, H. F.
    Saslow, G.
    Journal of Clinical Psychology, 1962, 18(4): 416-426
  • [32] Driver Behavior Analysis through Speech Emotion Understanding
    Kamaruddin, N.
    Wahab, A.
    2010 IEEE Intelligent Vehicles Symposium (IV), 2010: 238-243
  • [33] Speech and Silence Behavior of Bilinguals Conversing in Each of Two Languages
    Wiens, A. N.
    Manaugh, T. S.
    Matarazzo, J. D.
    Linguistics, 1976, (172): 79-94
  • [34] Silence and speech segmentation for noisy speech using a wavelet based algorithm
    Mei, X. D.
    Sun, S. H.
    Chinese Journal of Electronics, 2001, 10(4): 439-443
  • [36] State-based labelling for a sparse representation of speech and its application to robust speech recognition
    Virtanen, Tuomas
    Gemmeke, Jort F.
    Hurmalainen, Antti
    11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Vols 1-2, 2010: 893+
  • [37] Nonnative Speech Recognition Based on Bilingual Model Modification at State Level
    Zhang, Qingqing
    Pan, Jielin
    Chan, Shui-duen
    Yan, Yonghong
    Sixth International Symposium on Neural Networks (ISNN 2009), 2009, 56: 299+
  • [38] English speech emotion recognition method based on speech recognition
    Liu, Man
    International Journal of Speech Technology, 2022, 25(2): 391-398
  • [39] State-based bilingual model modification for nonnative speech recognition
    Zhang, Qingqing
    Li, Ta
    Pan, Jielin
    Yan, Yonghong
    2008 International Conference on Audio, Language and Image Processing, Vols 1 and 2, Proceedings, 2008: 1300-1304