Parallel convolutional neural network and hybrid architectures for accented speech recognition in Malayalam

Authors
Rizwana Kallooravi Thandil [1]
V. K. Muneer [1]
B. Premjith [2]
Affiliations
[1] University of Calicut
[2] Amrita School of Artificial Intelligence, Amrita Vishwa Vidyapeetham
Keywords
Accented speech recognition; Malayalam speech recognition; Speech signal preprocessing; Speech data augmentation; Dimensionality reduction; Speech feature extraction; Neural networks
DOI
10.1007/s42044-024-00212-w
Abstract
This study investigates approaches to recognizing accented speech in Malayalam, a language spoken in southern India. Because no freely available dataset existed in this domain, a dataset covering different accents of the language was constructed for the study. The collected data were preprocessed with band-pass filtering and audio normalization, and the dataset was augmented using time-stretching, pitch shifting, and added Gaussian noise. A total of 585 acoustic features were extracted from the speech signals using an adaptive fast Fourier transform (FFT) window size, spectral contrast, Tonnetz and polyfeatures, harmonic-to-noise ratio (HNR) and formants, zero-crossing rate (ZCR) and short-time Fourier transform, root mean square (RMS) and Mel spectrogram, and Mel-frequency cepstral coefficients (MFCC) and their deltas. Five accented speech models were constructed using a 2D parallel convolutional neural network (CNN), a 4D parallel CNN without an attention block, a 4D parallel CNN with an attention block, a bidirectional long short-term memory (LSTM) network, and a hybrid CNN-LSTM. The models built with the 4D parallel CNN with attention block and the hybrid CNN-LSTM architecture outperformed the other three, with higher accuracy and lower error rates.
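A few of the steps named in the abstract can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the function names, the 20 dB signal-to-noise ratio, and the 440 Hz test tone are assumptions chosen for illustration. Band-pass filtering, time-stretching, and pitch shifting are left out because they normally rely on a signal-processing library.

```python
import math
import random


def peak_normalize(signal, peak=1.0):
    """Scale samples so the largest absolute value equals `peak`."""
    max_abs = max(abs(s) for s in signal)
    if max_abs == 0:
        return list(signal)  # silent input: nothing to scale
    return [s * peak / max_abs for s in signal]


def rms(signal):
    """Root mean square (RMS) energy of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def add_gaussian_noise(signal, snr_db, seed=None):
    """Add zero-mean Gaussian noise at an assumed target SNR (in dB)."""
    rng = random.Random(seed)
    # Noise standard deviation that yields the requested SNR
    # relative to the RMS level of the clean signal.
    noise_std = rms(signal) / (10 ** (snr_db / 20))
    return [s + rng.gauss(0.0, noise_std) for s in signal]


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ (ZCR)."""
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


# Hypothetical input: a 440 Hz sine tone, 16 kHz sampling rate, 0.1 s long.
sample_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]

normalized = peak_normalize(tone)                          # audio normalization
noisy = add_gaussian_noise(normalized, snr_db=20, seed=0)  # augmentation
print("RMS:", rms(normalized), "ZCR:", zero_crossing_rate(normalized))
```

In a fuller pipeline, RMS and ZCR would typically be computed per short frame rather than over the whole signal, and the resulting values would be combined with the spectral features listed in the abstract.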
Pages: 125-149
Number of pages: 24