Dialogue Summarization with Mixture of Experts based on Large Language Models

Cited by: 0
Authors
Tian, Yuanhe [1,2]
Xia, Fei [2]
Song, Yan [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Univ Washington, Seattle, WA USA
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Dialogue summarization is an important task that requires generating highlights of a conversation from different aspects (e.g., the content of various speakers). While several studies successfully employ large language models (LLMs) and achieve satisfactory results, they are limited to using one model at a time or treating it as a black box, which makes it hard to discriminatively learn essential content in a dialogue from different aspects and may therefore lead to anticipation bias and potential loss of information in the produced summaries. In this paper, we propose an LLM-based approach with role-oriented routing and fusion generation to utilize a mixture of experts (MoE) for dialogue summarization. Specifically, role-oriented routing is an LLM-based module that selects appropriate experts to process different information; fusion generation is another LLM-based module that locates salient information and produces the finalized dialogue summaries. The proposed approach offers an alternative solution for employing multiple LLMs for dialogue summarization by leveraging their in-context processing and generation capabilities in an effective manner. We run experiments on widely used benchmark datasets for this task, and the results demonstrate the superiority of our approach in producing informative and accurate dialogue summaries.
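The abstract describes a two-stage, LLM-based pipeline: role-oriented routing assigns dialogue content to experts, and fusion generation merges the experts' outputs into a final summary. Below is a minimal sketch of how such a pipeline could be wired together, assuming a generic `call_llm` helper and prompt-based routing; the function names, prompts, and expert definitions are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of an MoE-style dialogue summarization pipeline (not the authors' code).
# `call_llm` is a hypothetical stand-in for any instruction-following LLM API.
from typing import Callable, Dict, List


def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with an actual API or local model."""
    raise NotImplementedError


def role_oriented_routing(dialogue: List[Dict[str, str]],
                          experts: Dict[str, str],
                          llm: Callable[[str], str] = call_llm) -> Dict[str, List[str]]:
    """Ask the LLM to assign each utterance to the expert best suited to process it."""
    assignments: Dict[str, List[str]] = {name: [] for name in experts}
    for turn in dialogue:
        utterance = f'{turn["speaker"]}: {turn["text"]}'
        prompt = (
            "Experts and their specialties:\n"
            + "\n".join(f"- {name}: {desc}" for name, desc in experts.items())
            + f"\n\nWhich expert should process this utterance?\n{utterance}\n"
            "Answer with the expert name only."
        )
        chosen = llm(prompt).strip()
        # Fall back to the first expert if the LLM's answer is not a known name.
        assignments.get(chosen, assignments[next(iter(experts))]).append(utterance)
    return assignments


def fusion_generation(expert_notes: Dict[str, str],
                      llm: Callable[[str], str] = call_llm) -> str:
    """Fuse the per-expert summaries into one final dialogue summary."""
    prompt = (
        "Combine the following aspect summaries into one concise, faithful "
        "dialogue summary, keeping only salient information:\n\n"
        + "\n\n".join(f"[{name}]\n{note}" for name, note in expert_notes.items())
    )
    return llm(prompt)


def summarize_dialogue(dialogue: List[Dict[str, str]],
                       experts: Dict[str, str],
                       llm: Callable[[str], str] = call_llm) -> str:
    """End-to-end: route utterances to experts, summarize per expert, then fuse."""
    routed = role_oriented_routing(dialogue, experts, llm)
    expert_notes = {
        name: llm(f"Summarize these utterances from the perspective of {name} "
                  f"({experts[name]}):\n" + "\n".join(turns))
        for name, turns in routed.items() if turns
    }
    return fusion_generation(expert_notes, llm)
```

As a usage illustration, `experts` could be defined as `{"decisions": "action items and decisions", "speakers": "per-speaker contributions"}`; `summarize_dialogue` would then route each utterance, produce one summary per expert, and fuse them into the final output.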
Pages: 7143-7155
Number of pages: 13
Related Papers
50 records in total
  • [1] Expert evaluation of large language models for clinical dialogue summarization
    Navarro, David Fraile
    Coiera, Enrico
    Hambly, Thomas W.
    Triplett, Zoe
    Asif, Nahyan
    Susanto, Anindya
    Chowdhury, Anamika
    Lorenzo, Amaya Azcoaga
    Dras, Mark
    Berkovsky, Shlomo
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [2] A Comprehensive Evaluation of Large Language Models for Turkish Abstractive Dialogue Summarization
    Buyuk, Osman
    IEEE ACCESS, 2024, 12 : 124391 - 124401
  • [3] Adapted large language models can outperform medical experts in clinical text summarization
    Van Veen, Dave
    Van Uden, Cara
    Blankemeier, Louis
    Delbrouck, Jean-Benoit
    Aali, Asad
    Bluethgen, Christian
    Pareek, Anuj
    Polacin, Malgorzata
    Reis, Eduardo Pontes
    Seehofnerova, Anna
    Rohatgi, Nidhi
    Hosamani, Poonam
    Collins, William
    Ahuja, Neera
    Langlotz, Curtis P.
    Hom, Jason
    Gatidis, Sergios
    Pauly, John
    Chaudhari, Akshay S.
    NATURE MEDICINE, 2024, 30 (04) : 1134 - 1142
  • [4] Adaptive Gating in Mixture-of-Experts based Language Models
    Li, Jiamin
    Su, Qiang
    Yang, Yitao
    Jiang, Yimin
    Wang, Cong
    Xu, Hong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 3577 - 3587
  • [5] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
    Lu, Xudong
    Liu, Qi
    Xu, Yuhui
    Zhou, Aojun
    Huang, Siyuan
    Zhang, Bo
    Yan, Junchi
    Li, Hongsheng
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 6159 - 6172
  • [6] Fine tuning the large language pegasus model for dialogue summarization
    Vinay Sarthak
    Preeti Rishiwal
    Mano Yadav
    Sushil Yadav
    Ashutosh Gangwar
    Shankdhar
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY, 2025, 17 (02) : 1165 - 1177
  • [7] Effectiveness of French Language Models on Abstractive Dialogue Summarization Task
    Zhou, Yongxin
    Portet, Francois
    Ringeval, Fabien
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3571 - 3581
  • [8] Benchmarking Large Language Models for News Summarization
    Zhang, Tianyi
    Ladhak, Faisal
    Durmus, Esin
    Liang, Percy
    Mckeown, Kathleen
    Hashimoto, Tatsunori B.
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 39 - 57
  • [9] Overcoming language barriers via machine translation with sparse Mixture-of-Experts fusion of large language models
    Zhu, Shaolin
    Jian, Dong
    Xiong, Deyi
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (03)
  • [10] Efficient Inference Offloading for Mixture-of-Experts Large Language Models in Internet of Medical Things
    Yuan, Xiaoming
    Kong, Weixuan
    Luo, Zhenyu
    Xu, Minrui
    ELECTRONICS, 2024, 13 (11)