A Primer on Seq2Seq Models for Generative Chatbots

Cited by: 2

Authors:

Scotti, Vincenzo [1]

Sbattella, Licia [1]

Tedesco, Roberto [1]

Affiliations:

[1] Politecn Milan, DEIB, Via Golgi 42, I-20133 Milan, MI, Italy

Source:

ACM COMPUTING SURVEYS | 2024 / Vol. 56 / Issue 03

Keywords:

Natural Language Processing; Seq2Seq; Language Model; generative chatbot; Open-domain dialogue

DOI:

10.1145/3604281

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the Natural Language Processing (NLP) area. The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks (such as tokenisation or POS tagging) to high-level, complex NLP applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only do these architectures make it possible to directly output the next turn in a conversation but, to some extent, they also allow the style or content of the response to be controlled. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights into possible future directions, given the current state of research.

Pages: 58
