THE ROYALFLUSH AUTOMATIC SPEECH DIARIZATION AND RECOGNITION SYSTEM FOR IN-CAR MULTI-CHANNEL AUTOMATIC SPEECH RECOGNITION CHALLENGE

Cited by: 0

Authors:

Tian, Jingguang [1]

Ye, Shuaishuai [1]

Chen, Shunfei [1]

Xiang, Yang [1]

Yin, Zhaohui [1]

Hu, Xinhui [1]

Xu, Xinkang [1]

Affiliations:

[1] Hithink RoyalFlush AI Research Institute, Hangzhou, Zhejiang, People's Republic of China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024 | 2024

Keywords:

ICMC-ASR; ASDR; TS-VAD; speaker diarization; speech recognition

DOI:

10.1109/ICASSPW62465.2024.10626136

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper presents our system submission for the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge, which focuses on speaker diarization and speech recognition in complex multi-speaker scenarios. To address these challenges, we develop end-to-end speaker diarization models that notably decrease the diarization error rate (DER) by 49.58% compared to the official baseline on the development set. For speech recognition, we utilize self-supervised learning representations to train end-to-end ASR models. By integrating these models, we achieve a character error rate (CER) of 16.93% on the track 1 evaluation set, and a concatenated minimum permutation character error rate (cpCER) of 25.88% on the track 2 evaluation set.
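
For readers unfamiliar with the two recognition metrics named in the abstract, the following is a minimal sketch of how CER and cpCER are commonly computed. It is not the challenge's official scoring script, and the function names (`edit_distance`, `cer`, `cp_cer`) and the toy transcripts are hypothetical. The sketch assumes that cpCER concatenates each speaker's utterances within a session, scores every assignment of hypothesis speakers to reference speakers, and keeps the lowest error rate.

```python
from itertools import permutations


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    return edit_distance(ref, hyp) / len(ref)


def cp_cer(ref_by_speaker, hyp_by_speaker) -> float:
    """Concatenated minimum-permutation CER for one session (illustrative).

    Each argument is a list with one entry per speaker; each entry is that
    speaker's utterances in time order.
    """
    refs = ["".join(utts) for utts in ref_by_speaker]
    hyps = ["".join(utts) for utts in hyp_by_speaker]
    # Pad with empty speakers so both sides have the same speaker count.
    n = max(len(refs), len(hyps))
    refs += [""] * (n - len(refs))
    hyps += [""] * (n - len(hyps))
    total_ref_chars = sum(len(r) for r in refs)
    # Try every speaker assignment and keep the one with the fewest errors.
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_chars


# Toy session: the hypothesis lists the speakers in swapped order and
# contains one wrong character.
reference = [["你好", "世界"], ["早上好"]]
hypothesis = [["早上好"], ["你好", "世节"]]
score = cp_cer(reference, hypothesis)
```

The exhaustive search over permutations grows factorially with the number of speakers, so this form is only practical for sessions with a handful of speakers.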
