Data Augmentation for Text Generation Without Any Augmented Data

Citations: 0
Authors
Bi, Wei [1]
Li, Huayang [1]
Huang, Jiacheng [1]
Affiliations
[1] Tencent AI Lab, Shenzhen, People's Republic of China
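Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract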
Data augmentation is an effective way to improve the performance of many neural text generation models. However, current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples. In this work, we derive an objective to formulate the problem of data augmentation on text generation tasks without any use of augmented data constructed by specific mapping functions. Our proposed objective can be efficiently optimized and applied to popular loss functions on text generation tasks with a convergence rate guarantee. Experiments on five datasets of two text generation tasks show that our approach can approximate or even surpass popular data augmentation methods.
Pages: 2223-2237
Number of pages: 15
Related papers
50 in total
  • [1] Neural Data-to-Text Generation with LM-based Text Augmentation
    Chang, Ernie
    Shen, Xiaoyu
    Zhu, Dawei
    Demberg, Vera
    Su, Hui
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 758 - 768
  • [2] Exploring Data Augmentation in Neural DRS-to-Text Generation
    Amin, Muhammad Saad
    Anselma, Luca
    Mazzei, Alessandro
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 2164 - 2178
  • [3] Avoiding Overlap in Data Augmentation for AMR-to-Text Generation
    Du, Wenchao
    Flanigan, Jeffrey
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 2, 2021, : 1043 - 1048
  • [4] Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation
    Liu, Ruibo
    Xu, Guangxuan
    Jia, Chenyan
    Ma, Weicheng
    Wang, Lili
    Vosoughi, Soroush
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 9031 - 9041
  • [5] Toward a Better Text Data Augmentation via Filtering and Transforming Augmented Instances
    Xia, Fei
    He, Shizhu
    Liu, Kang
    Liu, Shengping
    Zhao, Jun
    KNOWLEDGE GRAPH AND SEMANTIC COMPUTING: KNOWLEDGE GRAPH EMPOWERS NEW INFRASTRUCTURE CONSTRUCTION, 2021, 1466 : 198 - 210
  • [6] Controllable Meaning Representation to Text Generation: Linearization and Data Augmentation Strategies
    Kedzie, Chris
    McKeown, Kathleen
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 5160 - 5185
  • [7] Improving DRS-to-Text Generation Through Delexicalization and Data Augmentation
    Amin, Muhammad Saad
    Anselma, Luca
    Mazzei, Alessandro
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT I, NLDB 2024, 2024, 14762 : 121 - 136
  • [8] Unsupervised Data Augmentation with Naive Augmentation and without Unlabeled Data
    Lowell, David
    Howard, Brian E.
    Lipton, Zachary C.
    Wallace, Byron C.
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 4992 - 5001
  • [9] Text Data Augmentation for Deep Learning
    Shorten, Connor
    Khoshgoftaar, Taghi M.
    Furht, Borko
    JOURNAL OF BIG DATA, 2021, 8 (01)