A Primer on Seq2Seq Models for Generative Chatbots

Cited by: 2
Authors
Scotti, Vincenzo [1]
Sbattella, Licia [1]
Tedesco, Roberto [1]
Affiliations
[1] Politecn Milan, DEIB, Via Golgi 42, I-20133 Milan, MI, Italy
Keywords
Natural Language Processing; Seq2Seq; Language Model; generative chatbot; open-domain dialogue
DOI
10.1145/3604281
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the Natural Language Processing (NLP) area. The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks, such as tokenisation or POS tagging, to high-level, complex NLP applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only do these architectures make it possible to directly output the next turn in a conversation but, to some extent, they also allow control over the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights into possible future directions, given the current state of research.
Pages: 58