Automatic Speech-Based Smoking Status Identification

Cited by: 1
Authors
Ma, Zhizhong [1]
Singh, Satwinder [1 ]
Qiu, Yuanhang [1 ]
Hou, Feng [1 ]
Wang, Ruili [1 ]
Bullen, Christopher [2 ]
Chu, Joanna Ting Wai [2 ]
Affiliations
[1] Massey Univ, Sch Math & Computat Sci, Auckland, New Zealand
[2] Univ Auckland, Natl Inst Hlth Innovat, Auckland, New Zealand
Keywords
Smoking status identification; Speech processing; Acoustic features; Cigarette smoking; Voice
DOI
10.1007/978-3-031-10467-1_11
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Identifying the smoking status of a speaker from speech has a range of applications, including smoking status validation, smoking cessation tracking, and speaker profiling. Previous research on smoking status identification has mainly focused on employing a speaker's low-level acoustic features such as fundamental frequency (F0), jitter, and shimmer. However, the use of high-level acoustic features, such as Mel-frequency cepstral coefficients (MFCC) and filter bank (Fbank) features, for smoking status identification has rarely been explored. In this study, we utilise both high-level acoustic features (i.e., MFCC, Fbank) and low-level acoustic features (i.e., F0, jitter, shimmer) for smoking status identification. Furthermore, we propose a deep neural network approach to smoking status identification that employs ResNet together with these acoustic features. We also explore a data augmentation technique to further improve performance. Finally, we compare identification accuracy across feature settings and obtain a best accuracy of 82.3%, a relative improvement of 12.7% and 29.8% over the initial audio classification approach and the rule-based approach, respectively.
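
As context for the abstract above, the following is a minimal Python sketch of the kind of pipeline it describes: extracting high-level MFCC/Fbank features and low-level F0/jitter/shimmer statistics, plus a SpecAugment-style masking function as one plausible data augmentation. This is not the authors' code: librosa is an assumed tool choice, the parameter values are illustrative, the jitter/shimmer formulas are simplified proxies for the standard Praat definitions, and the abstract does not name its augmentation technique.

    import numpy as np
    import librosa

    def extract_features(path, sr=16000):
        """Illustrative feature extraction; tools and values are assumptions."""
        y, _ = librosa.load(path, sr=sr)

        # High-level features: MFCC and log-Mel filter bank (Fbank).
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)            # (13, T)
        fbank = librosa.power_to_db(
            librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40))    # (40, T)

        # Low-level features: F0 via pYIN; jitter/shimmer as rough proxies.
        f0, voiced, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
        f0v = f0[voiced & ~np.isnan(f0)]
        periods = 1.0 / f0v
        # Jitter proxy: mean absolute cycle-to-cycle period change, normalised.
        jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)
        # Shimmer proxy: frame-to-frame RMS amplitude variation, normalised.
        rms = librosa.feature.rms(y=y)[0]
        shimmer = np.mean(np.abs(np.diff(rms))) / np.mean(rms)

        low_level = np.array([np.mean(f0v), jitter, shimmer])
        return mfcc, fbank, low_level

    def spec_augment(fbank, max_f=8, max_t=20, rng=None):
        """SpecAugment-style frequency/time masking (an assumed augmentation)."""
        rng = rng or np.random.default_rng()
        fb = fbank.copy()
        f = rng.integers(0, max_f + 1)                 # mask width in Mel bins
        f_start = rng.integers(0, fb.shape[0] - f + 1)
        fb[f_start:f_start + f, :] = fb.mean()
        t = rng.integers(0, max_t + 1)                 # mask width in frames
        t_start = rng.integers(0, max(fb.shape[1] - t, 0) + 1)
        fb[:, t_start:t_start + t] = fb.mean()
        return fb

The 2-D Fbank (or MFCC) matrix would then be treated as a single-channel image input to a ResNet classifier, with the low-level statistics concatenated to a later layer, one common way of combining the two feature levels; the paper's exact fusion strategy is not stated in the abstract.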
Pages: 193-203
Page count: 11
Related Papers
50 records in total
  • [41] Speech-Based Activity Recognition for Trauma Resuscitation
    Abdulbaqi, Jalal
    Gu, Yue
    Xu, Zhichao
    Gao, Chenyang
    Marsic, Ivan
    Burd, Randall S.
2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020: 376-383
  • [42] Verifying Human Users in Speech-Based Interactions
    Shirali-Shahreza, Sajad
    Ganjali, Yashar
    Balakrishnan, Ravin
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 1596-1599
  • [43] Effect of Reverberation in Speech-based Emotion Recognition
    Zhao, Shujie
    Yang, Yan
    Chen, Jingdong
2018 IEEE INTERNATIONAL CONFERENCE ON THE SCIENCE OF ELECTRICAL ENGINEERING IN ISRAEL (ICSEE), 2018
  • [44] An architecture and applications for speech-based accessibility systems
    Turunen, M
    Hakulinen, J
    Räihä, KJ
    Salonen, EP
    Kainulainen, A
    Prusi, P
IBM SYSTEMS JOURNAL, 2005, 44(03): 485-504
  • [45] Speech-based cognitive load monitoring system
    Yin, Bo
    Chen, Fang
    Ruiz, Natalie
    Ambikairajah, Eliathamby
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008: 2041-2044
  • [46] Browsing the web from a speech-based interface
    Poon, J
    Nunn, C
HUMAN-COMPUTER INTERACTION - INTERACT'01, 2001: 302-309
  • [47] An investigation of speech-based human emotion recognition
    Wang, YJ
    Guan, L
2004 IEEE 6TH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 2004: 15-18
  • [48] VoiceWriting: a completely speech-based text editor
    De Marsico, Maria
    Mattei, Francesca Romana
PROCEEDINGS OF THE 14TH BIANNUAL CONFERENCE OF THE ITALIAN SIGCHI CHAPTER (CHIITALY 2021), 2021
  • [49] Towards Robust Speech-Based Emotion Recognition
    Tabatabaei, Talieh S.
    Krishnan, Sridhar
2010 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2010), 2010
  • [50] Portable Speech-based Aids for Blind Persons
    Kordon, U.
Informationstechnik und Technische Informatik, 39(02):