Automatic Speech-Based Smoking Status Identification

Cited by: 1

Authors:

Ma, Zhizhong [1]

Singh, Satwinder [1]

Qiu, Yuanhang [1]

Hou, Feng [1]

Wang, Ruili [1]

Bullen, Christopher [2]

Chu, Joanna Ting Wai [2]

Affiliations:

[1] Massey Univ, Sch Math & Computat Sci, Auckland, New Zealand

[2] Univ Auckland, Natl Inst Hlth Innovat, Auckland, New Zealand

Source:

INTELLIGENT COMPUTING, VOL 3 | 2022 / Volume 508

Keywords:

Smoking status identification; Speech processing; Acoustic features; CIGARETTE-SMOKING; VOICE;

DOI:

10.1007/978-3-031-10467-1_11

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Identifying the smoking status of a speaker from speech has a range of applications, including smoking status validation, smoking cessation tracking, and speaker profiling. Previous research on smoking status identification has mainly relied on the speaker's low-level acoustic features, such as fundamental frequency (F0), jitter, and shimmer. However, the use of high-level acoustic features, such as Mel Frequency Cepstral Coefficients (MFCC) and filter bank (Fbank) features, for smoking status identification has rarely been explored. In this study, we utilise both high-level acoustic features (i.e., MFCC, Fbank) and low-level acoustic features (i.e., F0, jitter, shimmer) for smoking status identification. Furthermore, we propose a deep neural network approach for smoking status identification by employing ResNet along with these acoustic features. We also explore a data augmentation technique to further improve performance. Finally, we compare identification accuracy across feature settings and obtain a best accuracy of 82.3%, a relative improvement of 12.7% and 29.8% over the initial audio classification approach and the rule-based approach, respectively.
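To make the low-level features named in the abstract concrete, the sketch below computes mean F0, local jitter, and local shimmer from per-cycle measurements. It is an illustration only and not the authors' pipeline: it assumes the glottal cycle durations and peak amplitudes have already been extracted by some pitch tracker, it uses the common "local" definition (mean absolute difference between consecutive cycles divided by the mean), and the function names and sample values are hypothetical.

```python
from statistics import mean


def mean_f0(periods_s):
    """Mean fundamental frequency in Hz from glottal cycle durations (seconds)."""
    return mean(1.0 / p for p in periods_s)


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean.

    Applied to cycle durations this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def low_level_features(periods_s, amplitudes):
    """Bundle the three low-level features into one dictionary."""
    return {
        "f0_hz": mean_f0(periods_s),
        "jitter_local": local_perturbation(periods_s),
        "shimmer_local": local_perturbation(amplitudes),
    }


# Hypothetical per-cycle measurements, for illustration only.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amps = [0.80, 0.78, 0.82, 0.79]
features = low_level_features(periods, amps)
print(features)
```

In a setup like the one the abstract describes, scalar features of this kind would be used alongside frame-level representations such as MFCC or Fbank as input to a classifier; how the paper actually combines them is not stated in this record.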

Pages: 193-203

Page count: 11
