Gender Recognition Based on the Stacking of Different Acoustic Features

Cited by: 1

Authors:

Yuecesoy, Erguen [1]

Affiliations:

[1] Ordu Univ, Vocat Sch Tech Sci, TR-52200 Ordu, Turkiye

Source:

APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 15

Keywords:

gender recognition; hybrid features; MFCC; KNN; LDA; CNN; MLP; machine learning; deep learning

DOI:

10.3390/app14156564

Chinese Library Classification:

O6 [Chemistry];

Discipline classification code:

0703 ;

Abstract:

A speech signal can provide various information about a speaker, such as their gender, age, accent, and emotional state. The gender of the speaker is the most salient piece of information contained in the speech signal and is directly or indirectly used in many applications. In this study, a new approach is proposed for recognizing the gender of the speaker based on hybrid features created by stacking different types of features. For this purpose, five different features, namely Mel frequency cepstral coefficients (MFCC), Mel scaled power spectrogram (Mel Spectrogram), Chroma, Spectral contrast (Contrast), and Tonal Centroid (Tonnetz), and twelve hybrid features created by stacking these features were used. These features were applied to four different classifiers, two based on traditional machine learning (KNN and LDA) and two based on deep learning (CNN and MLP), and the performance of each was evaluated separately. In experiments conducted on the Turkish subset of the Common Voice dataset, hybrid features created by stacking different acoustic features improved gender recognition accuracy by 0.3% to 1.73%.
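
The stacking idea in the abstract can be illustrated with a minimal sketch, assuming each acoustic feature has already been reduced to a fixed-length vector per utterance and that stacking means concatenating those vectors. The feature dimensions, the `stack_features` helper, and the synthetic data are all assumptions made for illustration; the record does not state the paper's actual extraction settings or which feature combinations form the twelve hybrids.

```python
import numpy as np

# Assumed per-utterance vector lengths for each feature type.
# These values are illustrative, not taken from the paper.
FEATURE_DIMS = {"mfcc": 40, "mel": 128, "chroma": 12, "contrast": 7, "tonnetz": 6}


def stack_features(features, names):
    """Concatenate the named feature vectors into one hybrid vector (hypothetical helper)."""
    return np.hstack([features[name] for name in names])


rng = np.random.default_rng(0)
# Synthetic stand-ins for features extracted from a single utterance.
utterance = {name: rng.standard_normal(dim) for name, dim in FEATURE_DIMS.items()}

# One possible hybrid: MFCC + Mel Spectrogram + Chroma.
hybrid = stack_features(utterance, ["mfcc", "mel", "chroma"])
print(hybrid.shape)  # (180,)
```

A hybrid vector built this way could then be passed to any of the classifiers named in the abstract in the same way as a single-feature vector.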

Pages: 13