Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR

Authors
Maekaku, Takashi [1]
Fujita, Yuya [1]
Peng, Yifan [2]
Watanabe, Shinji [2]
Affiliations
[1] Yahoo Japan Corporation, Tokyo, Japan
[2] Carnegie Mellon University, PA, United States
Keywords
751.5; Speech;
DOI
Not available
Pages: 1071-1075