KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation

Cited by: 0
Authors
Feng, Yuxi [1 ]
Yi, Xiaoyuan [2 ]
Lakshmanan, Laks V. S. [1 ]
Xie, Xing [2 ]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] Microsoft Res Asia, Beijing, Peoples R China
Source
PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023 | 2023
Funding
Natural Sciences and Engineering Research Council of Canada;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning. Nevertheless, in facilitating semi-supervised controllable language generation, ST faces two key challenges. First, augmented by self-generated pseudo text, generation models tend to over-exploit the previously learned text distribution, suffering from mode collapse and poor generation diversity. Second, generating pseudo text in each iteration is time-consuming, severely decelerating the training process. In this work, we propose KEST, a novel and efficient self-training framework to handle these problems. KEST utilizes a kernel-based loss, rather than standard cross entropy, to learn from the soft pseudo text produced by a shared non-autoregressive generator. We demonstrate both theoretically and empirically that KEST can benefit from more diverse pseudo text in an efficient manner, which allows not only refining and exploiting the previously fitted distribution but also exploring a larger potential text space, providing a guarantee of improved performance. Experiments on three controllable generation tasks demonstrate that KEST significantly improves control accuracy while maintaining comparable text fluency and generation diversity against several strong baselines.
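Illustrative sketch (not from the paper): the kernel-based loss named in the abstract can be pictured as a maximum mean discrepancy (MMD) between a batch of real-text features and a batch of soft pseudo-text features. The snippet below is a minimal sketch under stated assumptions: the RBF kernel, the fixed bandwidth sigma, the function names, and the random 128-dimensional toy features are all illustrative choices and may differ from the authors' exact formulation.

    import torch

    def rbf_kernel(x, y, sigma=1.0):
        # Pairwise RBF kernel values between rows of x and rows of y.
        sq_dist = torch.cdist(x, y, p=2) ** 2
        return torch.exp(-sq_dist / (2 * sigma ** 2))

    def mmd2_loss(real_feats, pseudo_feats, sigma=1.0):
        # Biased (V-statistic) estimate of squared MMD between two feature
        # batches: E[k(r, r')] + E[k(p, p')] - 2 * E[k(r, p)].
        k_rr = rbf_kernel(real_feats, real_feats, sigma).mean()
        k_pp = rbf_kernel(pseudo_feats, pseudo_feats, sigma).mean()
        k_rp = rbf_kernel(real_feats, pseudo_feats, sigma).mean()
        return k_rr + k_pp - 2.0 * k_rp

    # Toy usage: 8 real-text and 8 soft pseudo-text feature vectors.
    # In practice these would come from an encoder over token
    # distributions, not from random noise as here.
    real = torch.randn(8, 128)
    pseudo = torch.randn(8, 128)
    print(f"MMD^2 loss: {mmd2_loss(real, pseudo).item():.4f}")

Because such a kernel distance compares whole feature distributions rather than per-token hard labels, minimizing it can tolerate diverse soft pseudo text instead of collapsing onto previously generated modes, which is the intuition behind replacing cross entropy described in the abstract.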
Pages: 5049-5057
Page count: 9
Related Papers
50 records in total
  • [1] Chrysos, Grigorios G.; Kossaifi, Jean; Yu, Zhiding; Anandkumar, Anima. Unsupervised Controllable Generation with Self-Training. 2021 International Joint Conference on Neural Networks (IJCNN), 2021.
  • [2] Mehta, Sanket Vaibhav; Rao, Jinfeng; Tay, Yi; Kale, Mihir; Parikh, Ankur P.; Strubell, Emma. Improving Compositional Generalization with Self-Training for Data-to-Text Generation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1 (Long Papers), 2022: 4205-4219.
  • [3] Zhu, Yi; Zhang, Zhongyue; Wu, Chongruo; Zhang, Zhi; He, Tong; Zhang, Hang; Manmatha, R.; Li, Mu; Smola, Alexander. Improving Semantic Segmentation via Efficient Self-Training. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(3): 1589-1602.
  • [4] Zheng, Yabin; Teng, Shaohua; Liu, Zhiyuan; Sun, Maosong. Text Classification Based on Transfer Learning and Self-Training. ICNC 2008: Fourth International Conference on Natural Computation, Vol. 3, Proceedings, 2008: 363-367.
  • [5] Frinken, Volkmar; Bunke, Horst. Self-training for Handwritten Text Line Recognition. Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2010, 6419: 104-112.
  • [6] Hadifar, Amir; Sterckx, Lucas; Demeester, Thomas; Develder, Chris. A Self-Training Approach for Short Text Clustering. 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), 2019: 194-199.
  • [7] Pavlinek, Miha; Podgorelec, Vili. Text classification method based on self-training and LDA topic models. Expert Systems with Applications, 2017, 80: 83-93.
  • [8] Dzieniszewska, Aleksandra; Garbat, Piotr; Piramidowicz, Ryszard. Improving Skin Lesion Segmentation with Self-Training. Cancers, 2024, 16(6).
  • [9] Peng, Xiangyu; Sollami, Michael. XFBoost: Improving Text Generation with Controllable Decoders. arXiv, 2022.
  • [10] Chen, Yudi; Wang, Wei; Zhou, Yu; Yang, Fei; Yang, Dongbao; Wang, Weiping. Self-Training for Domain Adaptive Scene Text Detection. 2020 25th International Conference on Pattern Recognition (ICPR), 2021: 850-857.