Multi-Task Self-Supervised Learning for Disfluency Detection

Cited by: 0
Authors
Wang, Shaolei [1 ]
Che, Wanxiang [1 ]
Liu, Qi [2 ]
Qin, Pengda [3 ]
Liu, Ting [1 ]
Wang, William Yang [4 ]
Affiliations
[1] Harbin Inst Technol, Ctr Social Comp & Informat Retrieval, Harbin, Heilongjiang, Peoples R China
[2] Univ Oxford, Oxford, England
[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[4] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most existing approaches to disfluency detection rely heavily on human-annotated data, which is expensive to obtain in practice. To tackle the training data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words, and (ii) a sentence classification task to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly train a network. The pre-trained network is then fine-tuned using human-annotated disfluency detection training data. Experimental results on the commonly used English Switchboard test set show that our approach can achieve competitive performance compared to the previous systems (trained using the full dataset) by using less than 1% (1,000 sentences) of the training data. Our method trained on the full dataset significantly outperforms previous methods, reducing the error by 21% on English Switchboard.
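The pseudo-data construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`add_noise`, `delete_noise`, `make_examples`), the noise probability `p`, the filler vocabulary, and the 50/50 choice between deletion and insertion noise are all assumptions made for illustration, since the abstract does not specify them.

```python
import random

def add_noise(tokens, vocab, rng, p=0.15):
    """Insert random words before some tokens.

    Returns (noisy_tokens, tags), where tag 1 marks an inserted word
    and tag 0 marks an original word. `p` is an assumed noise rate.
    """
    noisy, tags = [], []
    for tok in tokens:
        if rng.random() < p:
            noisy.append(rng.choice(vocab))
            tags.append(1)
        noisy.append(tok)
        tags.append(0)
    return noisy, tags

def delete_noise(tokens, rng, p=0.15):
    """Randomly drop words, always keeping at least one token."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def make_examples(sentence, vocab, rng):
    """Build one tagging example and up to two classification examples."""
    tokens = sentence.split()
    noisy, tags = add_noise(tokens, vocab, rng)
    tagging_example = (noisy, tags)

    # Assumed policy: corrupt by deletion or insertion with equal chance.
    corrupted = delete_noise(tokens, rng) if rng.random() < 0.5 else noisy
    classification_examples = [(tokens, 1)]  # label 1 = original sentence
    if corrupted != tokens:
        # Only emit a negative example if the sentence actually changed.
        classification_examples.append((corrupted, 0))
    return tagging_example, classification_examples

if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["uh", "well", "market", "said"]  # hypothetical filler vocabulary
    print(make_examples("the company reported strong earnings", vocab, rng))
```

In this sketch the tagging task receives token-level labels for inserted words, while the sentence classification task receives sentence-level labels, so one shared encoder could be trained on both objectives jointly.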
Pages: 9193-9200
Number of pages: 8