A Primer on Seq2Seq Models for Generative Chatbots

Cited by: 2
Authors
Scotti, Vincenzo [1]
Sbattella, Licia [1]
Tedesco, Roberto [1]
Affiliations
[1] Politecn Milan, DEIB, Via Golgi 42, I-20133 Milan, MI, Italy
Keywords
Natural Language Processing; Seq2Seq; Language Model; generative chatbot; Open-domain dialogue
DOI
10.1145/3604281
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing (NLP). The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks (such as tokenisation or POS tagging) to high-level, complex NLP applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only do these architectures allow the next turn in a conversation to be output directly but, to some extent, they also allow the style or content of the response to be controlled. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights into possible future directions, given the current state of research.
Pages: 58