A Primer on Seq2Seq Models for Generative Chatbots

Cited by: 2
Authors
Scotti, Vincenzo [1]
Sbattella, Licia [1]
Tedesco, Roberto [1]
Affiliations
[1] Politecn Milan, DEIB, Via Golgi 42, I-20133 Milan, MI, Italy
Keywords
Natural Language Processing; Seq2Seq; Language Model; generative chatbot; Open-domain dialogue
DOI
10.1145/3604281
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing (NLP). The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks, such as tokenisation or POS tagging, to high-level, complex NLP applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only do these architectures make it possible to directly output the next turn in a conversation but, to some extent, they also allow control over the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights into possible future directions, given the current state of research.
Pages: 58
Related papers
50 in total
  • [21] Mongolian Word Segmentation Based on Three Character Level Seq2Seq Models
    Liu, Na
    Su, Xiangdong
    Gao, Guanglai
    Bao, Feilong
    NEURAL INFORMATION PROCESSING (ICONIP 2018), PT V, 2018, 11305 : 558 - 569
  • [22] Neural Question Generation based on Seq2Seq
    Liu, Bingran
    2020 5TH INTERNATIONAL CONFERENCE ON MATHEMATICS AND ARTIFICIAL INTELLIGENCE (ICMAI 2020), 2020, : 119 - 123
  • [23] Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction
    Zhang, Ranran Haoran
    Liu, Qianying
    Fan, Aysa Xuemo
    Ji, Heng
    Zeng, Daojian
    Cheng, Fei
    Kawahara, Daisuke
    Kurohashi, Sadao
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 236 - 246
  • [24] SGDG: Improving Transformer Seq2Seq Models through Span Generation and Denoise Generation
    Yang, Zhenfei
    Yu, Beiming
    Dou, Chenxiao
    Zhang, Qian
    Chua, Yansong
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2024, PT 2, 2025, 14851 : 486 - 495
  • [25] Calibrated Seq2seq Models for Efficient and Generalizable Ultra-fine Entity Typing
    Feng, Yanlin
    Pratapa, Adithya
    Mortensen, David
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 15550 - 15560
  • [26] Automatic Conversational Helpdesk Solution using Seq2Seq and Slot-filling Models
    Patidar, Mayur
    Agarwal, Puneet
    Vig, Lovekesh
    Shroff, Gautam
    CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2018, : 1967 - 1975
  • [27] Residual Seq2Seq model for Building energy management
    Kim, Marie
    Kim, Nae-soo
    Song, YuJin
    Pyo, Cheol Sig
    2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1126 - 1128
  • [28] Automatic Generation of Pseudocode with Attention Seq2seq Model
    Xu, Shaofeng
    Xiong, Yun
    2018 25TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2018), 2018, : 711 - 712
  • [29] Map Matching Based on Seq2Seq with Topology Information
    Bai, Yulong
    Li, Guolian
    Lu, Tianxiu
    Wu, Yadong
    Zhang, Weihan
    Feng, Yidan
    APPLIED SCIENCES-BASEL, 2023, 13 (23):
  • [30] A Study on Hierarchical Text Classification as a Seq2seq Task
    Torba, Fatos
    Gravier, Christophe
    Laclau, Charlotte
    Kammoun, Abderrhammen
    Subercaze, Julien
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT III, 2024, 14610 : 287 - 296