Improving Pretrained Language Model Fine-Tuning With Noise Stability Regularization

Cited by: 2
Authors
Hua, Hang [1 ]
Li, Xingjian [2 ]
Dou, Dejing [3 ]
Xu, Cheng-Zhong [4 ]
Luo, Jiebo [1 ]
Affiliations
[1] Univ Rochester, Dept Comp Sci, Rochester, NY 14627 USA
[2] Carnegie Mellon Univ, Computat Biol Dept, Pittsburgh, PA 15213 USA
[3] BCG Greater China, Beijing 100027, Peoples R China
[4] Univ Macau, State Key Lab IOTSC, Fac Sci & Technol, Macau, Peoples R China
Keywords
Stability analysis; Task analysis; Training; Transformers; Gaussian distribution; Standards; Optimization; Domain generalization; fine-tuning; in-domain generalization; pretrained language models (PLMs); regularization; NEURAL-NETWORKS;
DOI
10.1109/TNNLS.2023.3330926
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
The advent of large-scale pretrained language models (PLMs) has contributed greatly to the progress in natural language processing (NLP). Despite their recent success and wide adoption, fine-tuning a PLM often suffers from overfitting, which leads to poor generalizability due to the extremely high complexity of the model and the limited training samples from downstream tasks. To address this problem, we propose a novel and effective fine-tuning framework, named layerwise noise stability regularization (LNSR). Specifically, our method perturbs the input of neural networks with standard Gaussian or in-manifold noise in the representation space and regularizes each layer's output of the language model. We provide theoretical and experimental analyses to prove the effectiveness of our method. The empirical results show that our proposed method outperforms several state-of-the-art algorithms, such as L2 norm and start point (L2-SP), Mixout, FreeLB, and smoothness-inducing adversarial regularization and Bregman proximal point optimization (SMART). In addition to evaluating the proposed method on relatively simple text classification tasks, as in prior works, we further evaluate the effectiveness of our method on more challenging question-answering (QA) tasks. These tasks present a higher level of difficulty, and they provide a larger number of training examples for tuning a well-generalized model. Furthermore, the empirical results indicate that our proposed method can improve the domain generalization ability of language models.
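The layerwise noise stability idea described in the abstract can be sketched as follows. This is a minimal NumPy toy, not the authors' implementation: a tiny two-layer MLP stands in for the PLM, and the `sigma` noise scale and the penalty weighting (commented as `lambda_reg`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer network standing in for a pretrained language model.
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 4))

def forward(x):
    """Return the output of every layer for input x."""
    h1 = np.tanh(x @ W1)
    h2 = np.tanh(h1 @ W2)
    return [h1, h2]

def lnsr_penalty(x, sigma=0.1):
    """Layerwise noise stability penalty: mean squared distance between
    each layer's output on the clean input and on a Gaussian-perturbed
    copy of the input (perturbation in the representation space)."""
    noise = rng.normal(scale=sigma, size=x.shape)
    clean_outputs = forward(x)
    noisy_outputs = forward(x + noise)
    return sum(np.mean((c - n) ** 2)
               for c, n in zip(clean_outputs, noisy_outputs))

x = rng.normal(size=(32, 8))   # a batch of 32 input representations
penalty = lnsr_penalty(x)
# During fine-tuning, the training objective would be:
#   total_loss = task_loss + lambda_reg * penalty
```

A stable (well-regularized) model yields a small penalty: small input perturbations barely change any layer's output, which is the generalization property the regularizer encourages.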
Pages: 1898-1910
Page count: 13
Related Papers
50 records in total
  • [31] DN at SemEval-2023 Task 12: Low-Resource Language Text Classification via Multilingual Pretrained Language Model Fine-tuning
    Daniil, Homskiy
    Narek, Maloyan
    17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023, 2023, : 1537 - 1541
  • [32] Multi-phase Fine-Tuning: A New Fine-Tuning Approach for Sign Language Recognition
    Sarhan, Noha
    Lauri, Mikko
    Frintrop, Simone
    KUNSTLICHE INTELLIGENZ, 2022, 36 (01): 91 - 98
  • [34] ACTUNE: Uncertainty-Based Active Self-Training for Active Fine-Tuning of Pretrained Language Models
    Yu, Yue
    Kong, Lingkai
    Zhang, Jieyu
    Zhang, Rongzhi
    Zhang, Chao
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1422 - 1436
  • [35] Causal-Debias: Unifying Debiasing in Pretrained Language Models and Fine-tuning via Causal Invariant Learning
    Zhou, Fan
    Mao, Yuzhou
    Yu, Liu
    Yang, Yi
    Zhong, Ting
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 4227 - 4241
  • [36] Leveraging Pretrained Language Models for Enhanced Entity Matching: A Comprehensive Study of Fine-Tuning and Prompt Learning Paradigms
    Wang, Yu
    Zhou, Luyao
    Wang, Yuan
    Peng, Zhenwan
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2024, 2024
  • [37] Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning
    Zhang, Zhen-Ru
    Tan, Chuanqi
    Xu, Haiyang
    Wang, Chengyu
    Huang, Jun
    Huang, Songfang
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1239 - 1248
  • [38] MultiFiT: Efficient Multi-lingual Language Model Fine-tuning
    Eisenschlos, Julian
    Ruder, Sebastian
    Czapla, Piotr
    Kardas, Marcin
    Gugger, Sylvain
    Howard, Jeremy
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 5702 - 5707
  • [39] Improving fine-tuning in composite Higgs models
    Banerjee, Avik
    Bhattacharyya, Gautam
    Ray, Tirtha Sankar
    PHYSICAL REVIEW D, 2017, 96 (03)
  • [40] MediBioDeBERTa: Biomedical Language Model With Continuous Learning and Intermediate Fine-Tuning
    Kim, Eunhui
    Jeong, Yuna
    Choi, Myung-Seok
    IEEE ACCESS, 2023, 11 : 141036 - 141044