Neural Pipeline for Zero-Shot Data-to-Text Generation

Cited: 0
Authors
Kasner, Zdenek [1]
Dusek, Ondrej [1]
Affiliation
[1] Charles Univ Prague, Fac Math & Phys, Inst Formal & Appl Linguist, Prague, Czech Republic
Funding
European Research Council;
Keywords
NATURAL-LANGUAGE GENERATION; STATE-OF-THE-ART;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In data-to-text (D2T) generation, training on in-domain data leads to overfitting to the data representation and repeating training data noise. We examine how to avoid finetuning pretrained language models (PLMs) on D2T generation datasets while still taking advantage of the surface realization capabilities of PLMs. Inspired by pipeline approaches, we propose to generate text by transforming single-item descriptions with a sequence of modules trained on general-domain text-based operations: ordering, aggregation, and paragraph compression. We train PLMs for performing these operations on WIKIFLUENT, a synthetic corpus we build from English Wikipedia. Our experiments on two major triple-to-text datasets, WebNLG and E2E, show that our approach enables D2T generation from RDF triples in zero-shot settings.
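The pipeline described in the abstract can be summarized structurally as follows. This is a minimal sketch only: single triples are verbalized with a hypothetical template function, and the three PLM modules (ordering, aggregation, paragraph compression) are replaced by trivial rule-based stand-ins; all function names and templates are illustrative assumptions, not the authors' released implementation.

# Minimal structural sketch of the zero-shot D2T pipeline from the abstract.
# The three module stand-ins below are placeholders; in the paper these are
# PLMs trained on the synthetic WIKIFLUENT corpus built from English Wikipedia.

from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

def verbalize(triple: Triple) -> str:
    """Turn one RDF triple into a single-sentence description (template stand-in)."""
    subj, pred, obj = triple
    return f"{subj} {pred.replace('_', ' ')} {obj}."

def order(sentences: List[str]) -> List[str]:
    """Ordering module: choose a natural sentence order (identity stand-in)."""
    return sentences

def aggregate(sentences: List[str], group_size: int = 2) -> List[List[str]]:
    """Aggregation module: group sentences that should be fused (fixed-size stand-in)."""
    return [sentences[i:i + group_size] for i in range(0, len(sentences), group_size)]

def compress(group: List[str]) -> str:
    """Paragraph compression module: fuse a group into fluent text (join stand-in)."""
    return " ".join(group)

def generate(triples: List[Triple]) -> str:
    sentences = [verbalize(t) for t in triples]   # single-item descriptions
    ordered = order(sentences)                    # ordering
    groups = aggregate(ordered)                   # aggregation
    return " ".join(compress(g) for g in groups)  # paragraph compression

if __name__ == "__main__":
    triples = [
        ("Alan Bean", "birthPlace", "Wheeler, Texas"),
        ("Alan Bean", "occupation", "test pilot"),
        ("Alan Bean", "mission", "Apollo 12"),
    ]
    print(generate(triples))

In the paper, each stand-in corresponds to a learned module applied in sequence, so the zero-shot property comes from training the modules on general-domain text operations rather than on the target D2T datasets.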
Pages: 3914-3932
Page count: 19
Related Papers
50 items in total
  • [1] Neural Methods for Data-to-text Generation
    Sharma, Mandar
    Gogineni, Ajay Kumar
    Ramakrishnan, Naren
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2024, 15 (05)
  • [2] Zero-Shot Text-to-Image Generation
    Ramesh, Aditya
    Pavlov, Mikhail
    Goh, Gabriel
    Gray, Scott
    Voss, Chelsea
    Radford, Alec
    Chen, Mark
    Sutskever, Ilya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [3] A Survey on Neural Data-to-Text Generation
    Lin, Yupian
    Ruan, Tong
    Liu, Jingping
    Wang, Haofen
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (04) : 1431 - 1449
  • [4] Neural data-to-text generation: A comparison between pipeline and end-to-end architectures
    Ferreira, Thiago Castro
    van der Lee, Chris
    van Miltenburg, Emiel
    Krahmer, Emiel
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 552 - 562
  • [5] Neural data-to-text generation with dynamic content planning
    Chen, Kai
    Li, Fayuan
    Hu, Baotian
    Peng, Weihua
    Chen, Qingcai
    Yu, Hong
    Xiang, Yang
    KNOWLEDGE-BASED SYSTEMS, 2021, 215
  • [6] PAN: Pipeline assisted neural networks model for data-to-text generation in social internet of things
    Jiang, Nan
    Chen, Jing
    Zhou, Ri-Gui
    Wu, Changxing
    Chen, Honglong
    Zheng, Jiaqi
    Wan, Tao
    INFORMATION SCIENCES, 2020, 530 : 167 - 179
  • [7] Neural Data-to-Text Generation Guided by Predicted Plan
    Gao, Hanning
    Wei, Zhihua
    2022 IEEE 2ND INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SOFTWARE ENGINEERING (ICICSE 2022), 2022, : 53 - 59
  • [8] Neural Data-to-Text Generation with LM-based Text Augmentation
    Chang, Ernie
    Shen, Xiaoyu
    Zhu, Dawei
    Demberg, Vera
    Su, Hui
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 758 - 768
  • [9] Zero-Shot Turkish Text Classification
    Birim, Ahmet
    Erden, Mustafa
    Arslan, Levent M.
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [10] REGEN: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
    Yu, Yue
    Zhuang, Yuchen
    Zhang, Rongzhi
    Meng, Yu
    Shen, Jiaming
    Zhang, Chao
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 11782 - 11805