A Primer on Seq2Seq Models for Generative Chatbots

Cited by: 2
Authors
Scotti, Vincenzo [1]
Sbattella, Licia [1]
Tedesco, Roberto [1]
Affiliation
[1] Politecn Milan, DEIB, Via Golgi 42, I-20133 Milan, MI, Italy
Keywords
Natural Language Processing; Seq2Seq; Language Model; generative chatbot; Open-domain dialogue;
DOI
10.1145/3604281
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing. The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks, such as tokenisation or POS tagging, to high-level, complex NLP applications like machine translation and chatbots. This article examines recent trends in the development of open-domain data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only do these architectures directly output the next turn in a conversation but, to some extent, they also allow control over the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights on possible future directions, given the current state of research.
Pages: 58
Related papers
50 in total
  • [41] Tool Wear Monitoring System Using Seq2Seq
    Jeon, Wang-Su
    Rhee, Sang-Yong
    MACHINES, 2024, 12 (03)
  • [42] Adversarial Oracular Seq2seq Learning for Sequential Recommendation
    Zhao, Pengyu
    Shui, Tianxiao
    Zhang, Yuanxing
    Xiao, Kecheng
    Bian, Kaigui
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 1905 - 1911
  • [43] A Method for Estimating Process Maliciousness with Seq2Seq Model
    Tobiyama, Shun
    Yamaguchi, Yukiko
    Hasegawa, Hirokazu
    Shimada, Hajime
    Akiyama, Mitsuaki
    Yagi, Takeshi
    2018 32ND INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN), 2018, : 255 - 260
  • [44] Keyphrase Generation Based on Deep Seq2seq Model
    Zhang, Yong
    Xiao, Weidong
    IEEE ACCESS, 2018, 6 : 46047 - 46057
  • [45] Abstract Text Summarization with a Convolutional Seq2seq Model
    Zhang, Yong
    Li, Dan
    Wang, Yuheng
    Fang, Yang
    Xiao, Weidong
    APPLIED SCIENCES-BASEL, 2019, 9 (08):
  • [46] Seq2seq is All You Need for Coreference Resolution
    Zhang, Wenzheng
    Wiseman, Sam
    Stratos, Karl
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 11493 - 11504
  • [47] SEQ2SEQ++: A Multitasking-Based Seq2seq Model to Generate Meaningful and Relevant Answers
    Palasundram, Kulothunkan
    Sharef, Nurfadhlina Mohd
    Kasmiran, Khairul Azhar
    Azman, Azreen
    IEEE ACCESS, 2021, 9 (09): : 164949 - 164975
  • [48] GSSF: A Generative Sequence Similarity Function Based on a Seq2Seq Model for Clustering Online Handwritten Mathematical Answers
    Huy Quang Ung
    Cuong Tuan Nguyen
    Hung Tuan Nguyen
    Nakagawa, Masaki
    DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II, 2021, 12822 : 145 - 159
  • [49] Manufacturing Quality Management Based on TimeGAN and Seq2Seq Models With Magnetic Press Machine Data
    Jong Hyuk Lee
    Min Young Kim
    International Journal of Control, Automation and Systems, 2025, 23 (4) : 1199 - 1209
  • [50] Falls Prediction Based on Body Keypoints and Seq2Seq Architecture
    Hua, Minjie
    Nan, Yibing
    Lian, Shiguo
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1251 - 1259