LOW-ACTIVITY SUPERVISED CONVOLUTIONAL SPIKING NEURAL NETWORKS APPLIED TO SPEECH COMMANDS RECOGNITION

Cited by: 25
Authors
Pellegrini, Thomas [1]
Zimmer, Romain [1, 2]
Masquelier, Timothee [2]
Affiliations
[1] Univ Toulouse, IRIT, Toulouse, France
[2] Univ Toulouse 3, CNRS, CERCO UMR 5549, Toulouse, France
Keywords
Spiking neural networks; surrogate gradient; speech command recognition
DOI
10.1109/SLT48900.2021.9383587
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) are the current state-of-the-art models in many speech-related tasks. There is growing interest, though, in more biologically realistic, hardware-friendly, and energy-efficient models, named Spiking Neural Networks (SNNs). Recently, it has been shown that SNNs can be trained efficiently, in a supervised manner, using backpropagation with a surrogate gradient trick. In this work, we report speech command (SC) recognition experiments using supervised SNNs. We explored the Leaky Integrate-and-Fire (LIF) neuron model for this task, and show that a model composed of stacked dilated convolution spiking layers can reach an error rate very close to that of standard DNNs on the Google SC v1 dataset: 5.5%, while keeping a very sparse spiking activity, below 5%, thanks to a new regularization term. We also show that modeling the leakage of the neuron membrane potential is useful, since the LIF model significantly outperformed its non-leaky counterpart.
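To illustrate the distinction the abstract draws between leaky and non-leaky neurons, here is a minimal sketch of a single discrete-time integrate-and-fire neuron in plain Python. It is not the paper's model: the decay factor `beta`, the `threshold`, the hard reset to zero, and the constant input are all assumptions chosen for illustration, and neither surrogate-gradient training nor the regularization term is shown.

```python
def simulate_lif(inputs, beta=0.9, threshold=1.0):
    """Simulate one discrete-time leaky integrate-and-fire neuron.

    beta is an assumed per-step decay of the membrane potential;
    beta=1.0 gives a non-leaky integrate-and-fire neuron.
    Returns a list of 0/1 spikes, one per input step.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        # Leak the previous potential, then integrate the new input.
        potential = beta * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # hard reset after a spike (an assumption)
        else:
            spikes.append(0)
    return spikes


def activity_rate(spikes):
    """Fraction of time steps on which the neuron fired."""
    return sum(spikes) / len(spikes) if spikes else 0.0


# Hypothetical constant input current over 20 time steps.
inputs = [0.3] * 20
leaky = simulate_lif(inputs, beta=0.5)
non_leaky = simulate_lif(inputs, beta=1.0)

print("leaky activity:", activity_rate(leaky))
print("non-leaky activity:", activity_rate(non_leaky))
```

With this weak constant input, the leaky neuron's potential settles below the threshold and it stays silent, while the non-leaky neuron accumulates input until it fires periodically. This is only meant to show how leakage can change spiking activity in a toy setting.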
Pages: 97-103
Number of pages: 7
Related papers
50 in total
  • [31] IMPROVING CONVOLUTIONAL RECURRENT NEURAL NETWORKS FOR SPEECH EMOTION RECOGNITION
    Meyer, Patrick
    Xu, Ziyi
    Fingscheidt, Tim
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 365 - 372
  • [32] VERY DEEP CONVOLUTIONAL NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Qian, Yanmin
    Woodland, Philip C.
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 481 - 488
  • [33] Gender Differentiated Convolutional Neural Networks for Speech Emotion Recognition
    Mishra, Puneet
    Sharma, Ruchir
    2020 12TH INTERNATIONAL CONGRESS ON ULTRA MODERN TELECOMMUNICATIONS AND CONTROL SYSTEMS AND WORKSHOPS (ICUMT 2020), 2020, : 142 - 148
  • [34] FSER: Deep Convolutional Neural Networks for Speech Emotion Recognition
    Dossou, Bonaventure F. P.
    Gbenou, Yeno K. S.
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 3526 - 3531
  • [35] SPEECH EMOTION RECOGNITION USING QUATERNION CONVOLUTIONAL NEURAL NETWORKS
    Muppidi, Aneesh
    Radfar, Martin
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6309 - 6313
  • [36] Efficient GPU implementation of convolutional neural networks for speech recognition
    van den Berg, Ewout
    Brand, Daniel
    Bordawekar, Rajesh
    Rachevsky, Leonid
    Ramabhadran, Bhuvana
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1483 - 1487
  • [37] Speech Recognition of Punjabi Numerals Using Convolutional Neural Networks
    Aditi, Thakur
    Karun, Verma
    ADVANCES IN COMPUTER COMMUNICATION AND COMPUTATIONAL SCIENCES, VOL 1, 2019, 759 : 61 - 69
  • [38] Speech Emotion Recognition using Convolutional and Recurrent Neural Networks
    Lim, Wootaek
    Jang, Daeyoung
    Lee, Taejin
    2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,
  • [39] Spiking Deep Convolutional Neural Networks for Energy-Efficient Object Recognition
    Cao, Yongqiang
    Chen, Yang
    Khosla, Deepak
    International Journal of Computer Vision, 2015, 113 : 54 - 66
  • [40] Competitive Maximization of Neuronal Activity in Convolutional Recurrent Spiking Neural Networks
    Nekhaev, Dmitry
    Demin, Vyacheslav
    ADVANCES IN NEURAL COMPUTATION, MACHINE LEARNING, AND COGNITIVE RESEARCH III, 2020, 856 : 255 - 262