TeleSpeechPT: Large-Scale Chinese Multi-dialect and Multi-accent Speech Pre-training

Times Cited: 0
Authors
Chen, Hongjie [1 ]
Li, Zehan [1 ]
Xia, Guangmin [1 ]
Liu, Boqing [1 ]
Yang, Yan [1 ]
Kang, Jian [1 ]
Li, Jie [1 ]
Affiliations
[1] China Telecom, Institute of Artificial Intelligence (TeleAI), Beijing, People's Republic of China
Keywords
Speech Pre-training; Accented ASR; Dialectal ASR
DOI
10.1007/978-981-96-1045-7_15
Chinese Library Classification
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
We train Data2Vec2 models at various parameter scales on 300,000 hours of unannotated Chinese multi-dialect and multi-accent speech data. The models are validated on multiple speech recognition datasets, both by fine-tuning them and by using them as feature extractors for CTC-based automatic speech recognition. We release these models to the open-source community to facilitate research on, and applications of, speech processing technologies for Chinese dialects and accents.
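To illustrate the "feature extractor for CTC-based ASR" setup described in the abstract, the sketch below freezes a Data2Vec-style speech encoder and trains only a linear CTC head on top of its frame-level features. This is not the authors' released code: the Hugging Face Data2VecAudioModel class and the "facebook/data2vec-audio-base" checkpoint stand in for the TeleSpeechPT models (which may require a different loader), and the vocabulary size, batch, and labels are placeholders.

```python
# Minimal sketch, assuming a Hugging Face-style Data2Vec audio encoder.
# The checkpoint, vocab size, and dummy data are illustrative placeholders,
# not the TeleSpeechPT release itself.
import torch
import torch.nn as nn
from transformers import Data2VecAudioModel


class CTCHead(nn.Module):
    """Linear projection from encoder features to token logits (incl. CTC blank)."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, vocab_size)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # (batch, frames, hidden) -> (batch, frames, vocab)
        return self.proj(features)


# Frozen pre-trained encoder used purely as a feature extractor.
encoder = Data2VecAudioModel.from_pretrained("facebook/data2vec-audio-base")
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False

vocab_size = 5000  # placeholder: e.g. Chinese characters plus the blank symbol
head = CTCHead(encoder.config.hidden_size, vocab_size)
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

# Dummy batch: two 1-second utterances of 16 kHz audio (no padding, so all
# frames are valid).
waveforms = torch.randn(2, 16000)
with torch.no_grad():
    feats = encoder(waveforms).last_hidden_state        # (2, frames, hidden)

logits = head(feats)                                    # (2, frames, vocab)
log_probs = logits.log_softmax(-1).transpose(0, 1)      # (frames, batch, vocab)

targets = torch.randint(1, vocab_size, (2, 12))         # fake label ids
input_lengths = torch.full((2,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((2,), 12, dtype=torch.long)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow only into the CTC head; the encoder stays frozen
print(float(loss))
```

In this frozen-encoder configuration only the small CTC head is trained, which is one common way to probe the quality of pre-trained speech representations; full fine-tuning, as also reported in the paper, would instead unfreeze the encoder parameters.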
Pages: 183-190
Number of pages: 8
相关论文
共 50 条
  • [41] Robust feature learning for online discriminative tracking without large-scale pre-training
    Jun Zhang
    Bineng Zhong
    Pengfei Wang
    Cheng Wang
    Jixiang Du
    Frontiers of Computer Science, 2018, 12 : 1160 - 1172
  • [42] XCODE: Towards Cross-Language Code Representation with Large-Scale Pre-Training
    Lin, Zehao
    Li, Guodun
    Zhang, Jingfeng
    Deng, Yue
    Zeng, Xiangji
    Zhang, Yin
    Wan, Yao
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2022, 31 (03)
  • [43] CCSUMSP: A cross-subject Chinese speech decoding framework with unified topology and multi-modal semantic pre-training
    Huang, Shuai
    Wang, Yongxiong
    Luo, Huan
    INFORMATION FUSION, 2025, 119
  • [44] COMAVE: Contrastive Pre-training with Multi-scale Masking for Attribute Value Extraction
    Guo, Xinnan
    Deng, Wentao
    Chen, Yongrui
    Li, Yang
    Zhou, Mengdi
    Qi, Guilin
    Wu, Tianxing
    Dong, Yang
    Wang, Liubin
    Pan, Yong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 6007 - 6018
  • [45] TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training
    Liu, Yulong
    Zhu, Guibo
    Zhu, Bin
    Song, Qi
    Ge, Guojing
    Chen, Haoran
    Qiao, Guanhui
    Peng, Ru
    Wu, Lingxiang
    Wang, Jinqiao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [46] Adapting Large-Scale Pre-trained Models for Uni ed Dialect Speech Recognition Model
    Toyama, T.
    Kai, A.
    Kamiya, Y.
    Takahashi, N.
    Acta Physica Polonica A, 2024, 146 (04) : 413 - 418
  • [47] WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction
    Wu, Qiyu
    Nagata, Masaaki
    Tsuruoka, Yoshimasa
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 11084 - 11099
  • [48] Editorial for Special Issue on Large-scale Pre-training: Data, Models, and Fine-tuning
    Wen, Ji-Rong
    Huang, Zi
    Zhang, Hanwang
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (02) : 145 - 146
  • [49] Learning meaningful representation of single-neuron morphology via large-scale pre-training
    Fan, Yimin
    Li, Yaxuan
    Zhong, Yunhua
    Hong, Liang
    Li, Lei
    Li, Yu
    BIOINFORMATICS, 2024, 40 : ii128 - ii136
  • [50] A Comparison between Pre-training and Large-scale Back-translation for Neural Machine Translation
    Huang, Dandan
    Wang, Kun
    Zhang, Yue
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1718 - 1732