KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation

Cited by: 0
Authors
Feng, Yuxi [1 ]
Yi, Xiaoyuan [2 ]
Lakshmanan, Laks V. S. [1 ]
Xie, Xing [2 ]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] Microsoft Res Asia, Beijing, Peoples R China
Source
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023 | 2023
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend to over-exploit the previously learned text distribution, suffering from mode collapse and poor generation diversity. Second, generating pseudo text in each iteration is time-consuming, severely decelerating the training process. In this work, we propose KEST, a novel and efficient self-training framework to handle these problems. KEST utilizes a kernel-based loss, rather than standard cross entropy, to learn from the soft pseudo text produced by a shared non-autoregressive generator. We demonstrate both theoretically and empirically that KEST can benefit from more diverse pseudo text in an efficient manner, which allows not only refining and exploiting the previously fitted distribution but also exploring a larger potential text space, providing a guarantee of improved performance. Experiments on three controllable generation tasks demonstrate that KEST significantly improves control accuracy while maintaining comparable text fluency and generation diversity against several strong baselines.
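The abstract contrasts a kernel-based loss with standard cross entropy for learning from soft pseudo text. As an illustrative sketch only (this record does not give the paper's actual kernel, features, or objective), the snippet below computes a squared Maximum Mean Discrepancy (MMD) with an RBF kernel between two sets of hypothetical sentence embeddings, one for real text and one for pseudo text; the choice of MMD, the RBF bandwidth, and all names here are assumptions, not KEST's implementation.

```python
import numpy as np

def rbf_kernel(x, y, sigma=4.0):
    """RBF (Gaussian) kernel matrix between rows of x and rows of y."""
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd_loss(real_feats, pseudo_feats, sigma=4.0):
    """Squared Maximum Mean Discrepancy between the two feature sets.

    Unlike token-level cross entropy, this distance compares whole
    empirical distributions, which is one way a kernel objective can
    reward diverse pseudo text rather than collapsing onto modes the
    model has already learned."""
    k_rr = rbf_kernel(real_feats, real_feats, sigma).mean()
    k_pp = rbf_kernel(pseudo_feats, pseudo_feats, sigma).mean()
    k_rp = rbf_kernel(real_feats, pseudo_feats, sigma).mean()
    return k_rr + k_pp - 2.0 * k_rp

# Toy "embeddings": pseudo text drawn near the real distribution scores
# a smaller kernel distance than pseudo text drawn far from it.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(64, 16))
near_pseudo = rng.normal(0.0, 1.0, size=(64, 16))
far_pseudo = rng.normal(3.0, 1.0, size=(64, 16))
```

In a training loop, a distribution-level distance like this would replace the per-token cross-entropy term on pseudo examples, so diverse-but-plausible pseudo text is not penalized for deviating from any single reference sequence.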
Pages: 5049-5057
Page count: 9
Related Papers
50 entries total
  • [21] Structure-to-Text Generation with Self-Training, Acceptability Classifiers and Context-Conditioning for the GEM Shared Task
    Bakshi, Shreyan
    Batra, Soumya
    Heidari, Peyman
    Arun, Ankit
    Jain, Shashank
    White, Michael
    1ST WORKSHOP ON NATURAL LANGUAGE GENERATION, EVALUATION, AND METRICS (GEM 2021), 2021, : 136 - 147
  • [22] Self-training with partial labeling for multi-label text classification
    Ren J.
    Zhu T.
    Chen W.
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2024, 64 (04): : 679 - 687
  • [23] Costra: Confidence-based self-training
    Cheng, Shengjun
    Huang, Qingcheng
    Liu, Jiafeng
    Tang, Xianglong
    Journal of Computational Information Systems, 2013, 9 (24): : 9761 - 9769
  • [24] FEDERATED SELF-TRAINING FOR DATA-EFFICIENT AUDIO RECOGNITION
    Tsouvalas, Vasileios
    Saeed, Aaqib
    Ozcelebi, Tanir
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 476 - 480
  • [25] Improving Graph Neural Networks by combining active learning with self-training
    Katsimpras, Georgios
    Paliouras, Georgios
    DATA MINING AND KNOWLEDGE DISCOVERY, 2024, 38 (01) : 110 - 127
  • [26] Improving semi-supervised self-training with embedded manifold transduction
    Tao, Ye
    Zhang, Duzhou
    Cheng, Shengjun
    Tang, Xianglong
    TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2018, 40 (02) : 363 - 374
  • [28] Generation-driven Contrastive Self-training for Zero-shot Text Classification with Instruction-following LLM
    Zhang, Ruohong
    Wang, Yau-Shian
    Yang, Yiming
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 659 - 673
  • [29] Category-aware self-training for extremely weakly supervised text classification
    Su, Jing
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 269
  • [30] Development of a web-based self-training package for information retrieval using the distance education approach
    Sacchanand, Chutima
    Jaroenpuntaruk, Vipa
    ELECTRONIC LIBRARY, 2006, 24 (04): : 501 - 516