Gradual Syntactic Label Replacement for Language Model Pre-Training

Times Cited: 0
Authors
Wang, Yile [1 ]
Zhang, Yue [2 ]
Li, Peng [1 ]
Liu, Yang [3 ]
Affiliations
[1] Tsinghua Univ, Inst AI Ind Res, Beijing 100084, Peoples R China
[2] Westlake Univ, Sch Engn, Hangzhou 310024, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Language model pre-training; syntactic label replacement; curriculum learning; data-centric;
DOI
10.1109/TASLP.2023.3331096
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
Pre-training serves as the foundation of recent NLP models, where language modeling tasks are performed over large text corpora. Typical models such as BERT and GPT take the corpus as a whole and treat each word equally during language modeling. However, recent work shows that the frequency bias naturally present in raw corpora may limit the power of the language model. In this article, we propose a multi-stage training strategy that gradually increases the training vocabulary by modifying the training data. Specifically, we leverage syntactic structure as a bridge for infrequent words, replacing them with their corresponding syntactic labels and then recovering their original lexical surface for further training. This strategy yields an easy-to-hard curriculum learning process in which the model first learns the most common words and some basic syntactic concepts before recognizing a large number of uncommon words through their specific usages and the previously learned category knowledge. Experimental results show that this method improves the performance of both discriminative and generative pre-trained language models on benchmarks and various downstream tasks.
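The abstract describes a data-centric, multi-stage curriculum in which infrequent words are first replaced by their syntactic labels and later restored to their original surface forms. The following is a minimal, hypothetical Python sketch of that idea under assumed details (it is not the authors' released code); the function name build_staged_corpora, the frequency thresholds, and the POS-style placeholder tokens are illustrative assumptions.

```python
# Illustrative sketch (assumed names, not the authors' code) of staged syntactic
# label replacement: infrequent words are mapped to their syntactic labels in the
# early stage and restored to their original surface forms in later stages.
from collections import Counter
from typing import List, Tuple


def build_staged_corpora(
    tagged_corpus: List[List[Tuple[str, str]]],  # sentences as (word, syntactic_label) pairs
    freq_thresholds: List[int],                  # decreasing thresholds = growing vocabulary
) -> List[List[List[str]]]:
    """Return one token-level training corpus per curriculum stage (easy to hard)."""
    word_counts = Counter(word for sentence in tagged_corpus for word, _ in sentence)
    staged_corpora = []
    for threshold in freq_thresholds:
        stage = [
            [
                # Keep words that are frequent enough at this stage; replace the rest
                # with a label placeholder such as "[NOUN]" so the model first learns
                # coarse syntactic categories before seeing rare surface forms.
                word if word_counts[word] >= threshold else f"[{label}]"
                for word, label in sentence
            ]
            for sentence in tagged_corpus
        ]
        staged_corpora.append(stage)
    return staged_corpora


if __name__ == "__main__":
    corpus = [
        [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB")],
        [("the", "DET"), ("axolotl", "NOUN"), ("regenerated", "VERB")],
    ]
    # Stage 0 keeps only words seen at least twice; stage 1 restores the full vocabulary.
    for i, stage in enumerate(build_staged_corpora(corpus, freq_thresholds=[2, 1])):
        print(f"stage {i}:", stage)
```

In such a setup, each stage's corpus would be used for a separate phase of pre-training, so the model encounters the coarse label tokens before the rare words they stand for.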
Pages: 486-496
Page count: 11