THE ROYALFLUSH AUTOMATIC SPEECH DIARIZATION AND RECOGNITION SYSTEM FOR IN-CAR MULTI-CHANNEL AUTOMATIC SPEECH RECOGNITION CHALLENGE

Authors
Tian, Jingguang [1]
Ye, Shuaishuai [1]
Chen, Shunfei [1]
Xiang, Yang [1]
Yin, Zhaohui [1]
Hu, Xinhui [1]
Xu, Xinkang [1]
Affiliations
[1] Hithink RoyalFlush AI Research Institute, Hangzhou, Zhejiang, People's Republic of China
Keywords
ICMC-ASR; ASDR; TS-VAD; speaker diarization; speech recognition
DOI
10.1109/ICASSPW62465.2024.10626136
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents our system submission for the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge, which focuses on speaker diarization and speech recognition in complex multi-speaker scenarios. To address these challenges, we develop end-to-end speaker diarization models that notably decrease the diarization error rate (DER) by 49.58% compared to the official baseline on the development set. For speech recognition, we utilize self-supervised learning representations to train end-to-end ASR models. By integrating these models, we achieve a character error rate (CER) of 16.93% on the track 1 evaluation set, and a concatenated minimum permutation character error rate (cpCER) of 25.88% on the track 2 evaluation set.
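The abstract reports CER and cpCER without defining them. The sketch below illustrates how these two metrics are commonly computed: CER as character-level edit distance divided by reference length, and cpCER as the lowest CER over all assignments of hypothesis speakers to reference speakers, after concatenating each speaker's utterances. It is not the challenge's official scoring tool; the function names `edit_distance`, `cer`, and `cp_cer` are hypothetical, and the handling of mismatched speaker counts (padding with empty transcripts) is an assumption.

```python
from itertools import permutations


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance over reference length."""
    return edit_distance(ref, hyp) / len(ref)


def cp_cer(refs: list[str], hyps: list[str]) -> float:
    """Concatenated minimum-permutation CER (illustrative sketch).

    refs and hyps each hold one string per speaker: that speaker's
    utterances concatenated in time order. Assumption: if the speaker
    counts differ, the shorter list is padded with empty transcripts.
    """
    n = max(len(refs), len(hyps))
    refs = refs + [""] * (n - len(refs))
    hyps = hyps + [""] * (n - len(hyps))
    total_ref_chars = sum(len(r) for r in refs)
    # Try every speaker assignment and keep the one with the fewest errors.
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_chars
```

The exhaustive search over permutations is factorial in the number of speakers, which is acceptable for the handful of speakers in a car cabin; for larger sessions an assignment solver would be the usual substitute.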
Pages: 1 / 2
Page count: 2