A Primer on Seq2Seq Models for Generative Chatbots

Cited by: 2

Authors:

Scotti, Vincenzo [1]

Sbattella, Licia [1]

Tedesco, Roberto [1]

Affiliations:

[1] Politecn Milan, DEIB, Via Golgi 42, I-20133 Milan, MI, Italy

Source:

ACM COMPUTING SURVEYS | 2024 / Vol. 56 / Issue 03

Keywords:

Natural Language Processing; Seq2Seq; Language Model; generative chatbot; Open-domain dialogue

DOI:

10.1145/3604281

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing (NLP). The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks (such as tokenisation or POS tagging) to high-level, complex NLP applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only do these architectures make it possible to directly output the next turn in a conversation but, to some extent, they also make it possible to control the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights into possible future directions, given the current state of research.

Pages: 58

50 items in total

[31] Short-term water level prediction with a Seq2Seq model
刘艳
张婷
康爱卿
李建柱
雷晓辉
水利水电科技进展 (Advances in Science and Technology of Water Resources), 2022, 42 (03) : 57 - 63
[32] Smoothing and Shrinking the Sparse Seq2Seq Search Space
Peters, Ben
Martins, Andre F. T.
2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 2642 - 2654
[33] Gallagher at SemEval-2023 Task 5: Tackling Clickbait with Seq2Seq Models
Bilgis, Tugay
Bozdag, Nimet Beyza
Bethard, Steven
17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023, 2023, : 1650 - 1655
[34] A Chinese text corrector based on seq2seq model
Gu, Sunyan
Lang, Fei
2017 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY (CYBERC), 2017, : 322 - 325
[35] Sliding Window Seq2seq Modeling for Engagement Estimation
Yu, Jun
Lu, Keda
Jing, Mohan
Liang, Ziqi
Zhang, Bingyuan
Sun, Jianqing
Liang, Jiaen
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 9496 - 9500
[36] From Disjoint Sets to Parallel Data to Train Seq2Seq Models for Sentiment Transfer
Cavalin, Paulo
Vasconcelos, Marisa
Grave, Marcelo
Pinhanez, Claudio
Henrique, Victor
Ribeiro, Alves
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 689 - 698
[37] SparQL Query Prediction Based on Seq2Seq Model
Yang D.-H.
Zou K.-F.
Wang H.-Z.
Wang J.-B.
Ruan Jian Xue Bao/Journal of Software, 2021, 32 (03) : 805 - 817
[38] Untargeted Code Authorship Evasion with Seq2Seq Transformation
Choi, Soohyeon
Jang, Rhongho
Nyang, DaeHun
Mohaisen, David
arXiv, 2023
[39] Exaggerated Portrait Caricatures Generation Based On Seq2Seq
Xu, Kun
Tang, Chenwei
Lv, Jiancheng
He, Zhenan
2019 9TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST2019), 2019, : 36 - 44
[40] Guesswork for Inference in Machine Translation with Seq2seq Model
Liu, Lilian
Malak, Derya
Medard, Muriel
2019 IEEE INFORMATION THEORY WORKSHOP (ITW), 2019, : 60 - 64