Matching Pursuit and Sparse Coding for Auditory Representation

Cited by: 6
Authors
Tran, Dung Kim [1 ]
Unoki, Masashi [1 ]
Affiliations
[1] Japan Adv Inst Sci & Technol, Grad Sch Adv Sci & Technol, Nomi, Ishikawa 9231292, Japan
Keywords
Kernel; Time-frequency analysis; Spectrogram; Matching pursuit algorithms; Bandwidth; Dictionaries; Psychoacoustics; Auditory filterbank; equivalent rectangular bandwidth; gammatone; gammachirp; masking effect; matching pursuit; perceptual features; sparse coding; spectrogram; spikegram; RECOGNITION; FILTER; DOMAIN;
DOI
10.1109/ACCESS.2021.3135011
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Previous studies have revealed that by mimicking the neural activity patterns of the auditory periphery to obtain perceptual features of speech signals, the resultant auditory representation is beneficial to speech-coding and pattern-analysis applications in comparison with spectrogram and spikegram representations. However, current solutions use outdated techniques such as the Bark scale and gammatone basis to decompose speech signals. We propose a method that uses more physiologically accurate techniques such as the equivalent rectangular bandwidth scale, gammachirp basis, and auditory masking effects of gammachirp kernels. Our experimental results indicate that the auditory representation created with our proposed method requires the lowest bitrate (1066 coefficients per second on average) to achieve similar perceptual evaluation scores (0.89 PEMO-Q and 3.27 PESQ scores) compared with spectrogram and spikegram representations. The proposed method also provides the highest matching accuracy with a pattern-matching algorithm.
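To make the terms in the abstract concrete, the following is a minimal sketch of matching pursuit over a small kernel dictionary. It is not the authors' implementation: it assumes plain gammatone kernels as a stand-in for the gammachirp kernels, omits the auditory masking step entirely, and uses the widely cited Glasberg and Moore formula for the equivalent rectangular bandwidth. The function names, the kernel duration, and the bandwidth factor `b=1.019` are illustrative choices, not values taken from the paper.

```python
import numpy as np

def erb_bandwidth(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def gammatone_kernel(fc_hz, fs_hz, duration_s=0.025, order=4, b=1.019):
    """Unit-norm gammatone kernel (assumed stand-in for a gammachirp kernel)."""
    n = int(round(duration_s * fs_hz))
    t = np.arange(n) / fs_hz
    envelope = t ** (order - 1) * np.exp(-2.0 * np.pi * b * erb_bandwidth(fc_hz) * t)
    g = envelope * np.cos(2.0 * np.pi * fc_hz * t)
    return g / np.linalg.norm(g)

def matching_pursuit(signal, kernels, n_atoms):
    """Greedy decomposition: returns (kernel index, offset, amplitude) spikes
    and the residual left after subtracting the selected atoms."""
    residual = np.asarray(signal, dtype=float).copy()
    spikes = []
    for _ in range(n_atoms):
        best = None
        for k, g in enumerate(kernels):
            # Inner product of the residual with the kernel at every valid offset.
            corr = np.correlate(residual, g, mode="valid")
            pos = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[pos]) > abs(best[2]):
                best = (k, pos, corr[pos])
        k, pos, amp = best
        # Kernels are unit-norm, so the projection coefficient is the inner product.
        residual[pos:pos + len(kernels[k])] -= amp * kernels[k]
        spikes.append((k, pos, float(amp)))
    return spikes, residual
```

Each returned spike records which kernel was selected, the sample offset where it starts, and its amplitude, which is the kind of sparse time-frequency code that a spikegram-style representation stores. Because every kernel has unit norm, each iteration can only lower the residual energy.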
Pages: 167084 - 167095
Number of pages: 12
Related papers
50 items in total
  • [21] An Improved Sparse Representation Based on Local Orthogonal Matching Pursuit for Bearing Compound Fault Diagnosis
    Yi, Cai
    Ran, Le
    Tang, Jiayin
    Jin, Hang
    Zhuang, Zhe
    Zhou, Qiuyang
    Lin, Jianhui
    IEEE SENSORS JOURNAL, 2022, 22 (22) : 21911 - 21923
  • [22] Bio-Inspired Sparse Representation of Speech and Audio Using Psychoacoustic Adaptive Matching Pursuit
    Petrovsky, Alexey
    Herasimovich, Vadzim
    Petrovsky, Alexander
    SPEECH AND COMPUTER, 2016, 9811 : 156 - 164
  • [23] Orthogonal Matching Pursuit for Sparse Quantile Regression
    Aravkin, Aleksandr
    Lozano, Aurelie
    Luss, Ronny
    Kambadur, Prabhanjan
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2014, : 11 - 19
  • [24] Sparse classification using Group Matching Pursuit
    Zheng, Shuai
    Ding, Chris
    NEUROCOMPUTING, 2019, 338 : 83 - 91
  • [25] Sparse approximation using fast matching pursuit
    Gan, Tao
    He, Yanmin
    Zhu, Weile
    2007 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS, VOLS 1 AND 2, 2007, : 407 - 410
  • [26] ADAPTIVE MATCHING PURSUIT FOR SPARSE SIGNAL RECOVERY
    Vu, Tiep H.
    Mousavi, Hojjat S.
    Monga, Vishal
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4331 - 4335
  • [27] Palmprint Recognition via Sparse Coding Spatial Pyramid Matching Representation of SIFT Feature
    Liu, Ligang
    Zhang, Jianxin
    Yang, Aoqi
    BIOMETRIC RECOGNITION, 2016, 9967 : 235 - 243
  • [28] A novel video coding framework by perceptual representation and macroblock-based matching pursuit algorithm
    Zhang, Jianning
    Sun, Lifeng
    Zhong, Yuzhuo
    ADVANCES IN MULTIMEDIA MODELING, PT 1, 2007, 4351 : 322 - 331
  • [29] Action-Based Pedestrian Identification via Hierarchical Matching Pursuit and Order Preserving Sparse Coding
    Si-Bao Chen
    Yi Xin
    Bin Luo
    Cognitive Computation, 2016, 8 : 797 - 805
  • [30] SPARSE REPRESENTATION OF HUMAN AUDITORY SYSTEM
    Edalatian, Mohammad
    Soltani, Ali Asghar
    Faraji, Neda
    2016 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2016, : 302 - 306