Selective Biasing with Trie-based Contextual Adapters for Personalised Speech Recognition using Neural Transducers

Cited by: 1
Authors
Harding, Philip [1 ]
Tong, Sibo [1 ]
Wiesler, Simon [1 ]
Affiliations
[1] Amazon Alexa, Munich, Germany
Source
INTERSPEECH 2023
Keywords
speech recognition; contextual biasing; personalisation
DOI
10.21437/Interspeech.2023-739
Chinese Library Classification
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
Neural transducer ASR models achieve state-of-the-art accuracy on many tasks; however, rare word recognition poses a particular challenge, as models often fail to recognise words that occur rarely, or not at all, in the training data. Contextual biasing methods, in which a model is dynamically adapted to bias its outputs towards a given list of relevant words and phrases, have been shown to be effective at alleviating this issue. While such methods improve rare word recognition, over-biasing can degrade accuracy on common words. In this work we propose several extensions to a recently proposed trie-based method of contextual biasing. We show how the method's rare word recognition can be improved, especially in the case of very large catalogues, by introducing a simple normalisation term; how the method can be trained as an adapter module; and how selective biasing can be applied to practically eliminate over-biasing on common words.
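To make the mechanism described above concrete, the sketch below shows the general shape of trie-based contextual biasing in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names (TrieNode, build_trie, bias_scores, step, selective_bias), the boost-divided-by-branching-factor normalisation, and the common-token filter are all hypothetical stand-ins for the normalisation term and selective biasing the abstract refers to.

from dataclasses import dataclass, field

@dataclass
class TrieNode:
    # Maps a subword token to the child trie node; is_end marks a complete
    # catalogue phrase.
    children: dict = field(default_factory=dict)
    is_end: bool = False

def build_trie(phrases):
    """Insert each catalogue phrase (a list of subword tokens) into a trie."""
    root = TrieNode()
    for tokens in phrases:
        node = root
        for tok in tokens:
            node = node.children.setdefault(tok, TrieNode())
        node.is_end = True
    return root

def bias_scores(state, boost=2.0):
    """Score the continuations reachable from the current trie state.

    Dividing the boost by the branching factor is a simplified stand-in
    (an assumption here) for a normalisation term: without it, nodes with
    very many children, as in very large catalogues, would inject far more
    total bias than narrow ones.
    """
    n = max(len(state.children), 1)
    return {tok: boost / n for tok in state.children}

def selective_bias(scores, common_tokens):
    """Selective biasing, simplified: drop boosts on common tokens so that
    frequent words are not over-biased."""
    return {tok: s for tok, s in scores.items() if tok not in common_tokens}

def step(state, root, token):
    """Advance the trie state with the token the decoder just emitted;
    on a miss, fall back to the root (a real decoder would also try to
    re-enter the trie from the root with the failed token)."""
    return state.children.get(token, root)

# Example: bias towards two (hypothetical) contact names.
root = build_trie([["jo", "hn"], ["jo", "an", "na"]])
state = root
print(bias_scores(state))                    # {'jo': 2.0}: one branch at the root
state = step(state, root, "jo")
print(bias_scores(state))                    # {'hn': 1.0, 'an': 1.0}: boost split across branches
print(selective_bias(bias_scores(state), common_tokens={"an"}))  # {'hn': 1.0}

In an actual transducer decoder, such per-token boosts would typically be added to the model's output log-probabilities at each decoding step; the adapter-module variant proposed in the paper instead learns the biasing behaviour end-to-end, which this sketch does not attempt to reproduce.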
Pages: 256-260
Number of pages: 5
Related Papers
50 entries in total (entries [41]-[50] shown)
  • [41] Peng, Zhichao; Zhu, Zhi; Unoki, Masashi; Dang, Jianwu; Akagi, Masato. Speech Emotion Recognition Using Multichannel Parallel Convolutional Recurrent Neural Networks based on Gammatone Auditory Filterbank. 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2017), 2017: 1750-1755.
  • [42] Dumitru, Corneliu Octavian; Gavat, Inge. Vowel, digit and continuous speech recognition based on statistical, neural and hybrid modelling by using ASRS_RL. EUROCON 2007: The International Conference on Computer as a Tool, Vols 1-6, 2007: 670-677.
  • [43] Pawar, Manju D.; Kokate, Rajendra D. Convolution neural network based automatic speech emotion recognition using Mel-frequency Cepstrum coefficients. Multimedia Tools and Applications, 2021, 80(10): 15563-15587.
  • [44] Hayakawa, Daichi; Kagoshima, Takehiko; Fujimura, Hiroshi. Mask-based Beamforming Using Complex-valued Neural Network for Recognition of Spatial Target Speech. 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2021: 23-29.
  • [46] Bakhshi, Ali; Wong, Aaron S. W.; Chalup, Stephan. End-To-End Speech Emotion Recognition Based on Time and Frequency Information Using Deep Neural Networks. ECAI 2020: 24th European Conference on Artificial Intelligence, 2020, 325: 969-975.
  • [47] Gao, Wenbin; Zhang, Lei; Huang, Wenbo; Min, Fuhong; He, Jun; Song, Aiguo. Deep Neural Networks for Sensor-Based Human Activity Recognition Using Selective Kernel Convolution. IEEE Transactions on Instrumentation and Measurement, 2021, 70.
  • [48] Mustaqeem; Kwon, Soonil. Optimal feature selection based speech emotion recognition using two-stream deep convolutional neural network. International Journal of Intelligent Systems, 2021, 36(09): 5116-5135.
  • [49] Ayadi, Souha; Lachiri, Zied. Deep Neural Network for visual Emotion Recognition based on ResNet50 using Song-Speech characteristics. Proceedings of the 2022 5th International Conference on Advanced Systems and Emergent Technologies (IC_ASET'2022), 2022: 363-368.
  • [50] Cavalcanti, Julio Cesar; da Silva, Ronaldo Rodrigues; Eriksson, Anders; Barbosa, Plinio A. Exploring the performance of automatic speaker recognition using twin speech and deep learning-based artificial neural networks. Frontiers in Artificial Intelligence, 2024, 7.