Dialogue Summarization with Mixture of Experts based on Large Language Models

Cited by: 0
Authors
Tian, Yuanhe [1,2]
Xia, Fei [2]
Song, Yan [1]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Univ Washington, Seattle, WA USA
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Dialogue summarization is an important task that requires generating highlights of a conversation from different aspects (e.g., the content of various speakers). While several studies successfully employ large language models (LLMs) and achieve satisfactory results, they are limited to using one model at a time or treating it as a black box, which makes it hard to discriminatively learn essential content in a dialogue from different aspects and may therefore lead to anticipation bias and potential loss of information in the produced summaries. In this paper, we propose an LLM-based approach with role-oriented routing and fusion generation to utilize a mixture of experts (MoE) for dialogue summarization. Specifically, role-oriented routing is an LLM-based module that selects appropriate experts to process different information; fusion generation is another LLM-based module that locates salient information and produces the finalized dialogue summaries. The proposed approach offers an alternative solution for employing multiple LLMs for dialogue summarization by leveraging their in-context processing and generation capabilities in an effective manner. We run experiments on widely used benchmark datasets for this task, and the results demonstrate the superiority of our approach in producing informative and accurate dialogue summaries.
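The abstract describes a two-stage, LLM-based pipeline: role-oriented routing assigns dialogue content to experts, and fusion generation merges the experts' outputs into a final summary. Below is a minimal sketch of how such a pipeline could be wired together, assuming a generic `call_llm` helper and prompt-based routing; the function names, prompts, and expert definitions are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of an MoE-style dialogue summarization pipeline (not the authors' code).
# `call_llm` is a hypothetical stand-in for any instruction-following LLM API.
from typing import Callable, Dict, List


def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with an actual API or local model."""
    raise NotImplementedError


def role_oriented_routing(dialogue: List[Dict[str, str]],
                          experts: Dict[str, str],
                          llm: Callable[[str], str] = call_llm) -> Dict[str, List[str]]:
    """Ask the LLM to assign each utterance to the expert best suited to process it."""
    assignments: Dict[str, List[str]] = {name: [] for name in experts}
    for turn in dialogue:
        utterance = f'{turn["speaker"]}: {turn["text"]}'
        prompt = (
            "Experts and their specialties:\n"
            + "\n".join(f"- {name}: {desc}" for name, desc in experts.items())
            + f"\n\nWhich expert should process this utterance?\n{utterance}\n"
            "Answer with the expert name only."
        )
        chosen = llm(prompt).strip()
        # Fall back to the first expert if the LLM's answer is not a known name.
        assignments.get(chosen, assignments[next(iter(experts))]).append(utterance)
    return assignments


def fusion_generation(expert_notes: Dict[str, str],
                      llm: Callable[[str], str] = call_llm) -> str:
    """Fuse the per-expert summaries into one final dialogue summary."""
    prompt = (
        "Combine the following aspect summaries into one concise, faithful "
        "dialogue summary, keeping only salient information:\n\n"
        + "\n\n".join(f"[{name}]\n{note}" for name, note in expert_notes.items())
    )
    return llm(prompt)


def summarize_dialogue(dialogue: List[Dict[str, str]],
                       experts: Dict[str, str],
                       llm: Callable[[str], str] = call_llm) -> str:
    """End-to-end: route utterances to experts, summarize per expert, then fuse."""
    routed = role_oriented_routing(dialogue, experts, llm)
    expert_notes = {
        name: llm(f"Summarize these utterances from the perspective of {name} "
                  f"({experts[name]}):\n" + "\n".join(turns))
        for name, turns in routed.items() if turns
    }
    return fusion_generation(expert_notes, llm)
```

As a usage illustration, `experts` could be defined as `{"decisions": "action items and decisions", "speakers": "per-speaker contributions"}`; `summarize_dialogue` would then route each utterance, produce one summary per expert, and fuse them into the final output.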
Pages: 7143-7155
Number of pages: 13
Related Papers
50 records in total
  • [1] Expert evaluation of large language models for clinical dialogue summarization
    Navarro, David Fraile
    Coiera, Enrico
    Hambly, Thomas W.
    Triplett, Zoe
    Asif, Nahyan
    Susanto, Anindya
    Chowdhury, Anamika
    Lorenzo, Amaya Azcoaga
    Dras, Mark
    Berkovsky, Shlomo
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [2] A Comprehensive Evaluation of Large Language Models for Turkish Abstractive Dialogue Summarization
    Buyuk, Osman
    IEEE ACCESS, 2024, 12 : 124391 - 124401
  • [3] Adapted large language models can outperform medical experts in clinical text summarization
    Van Veen, Dave
    Van Uden, Cara
    Blankemeier, Louis
    Delbrouck, Jean-Benoit
    Aali, Asad
    Bluethgen, Christian
    Pareek, Anuj
    Polacin, Malgorzata
    Reis, Eduardo Pontes
    Seehofnerova, Anna
    Rohatgi, Nidhi
    Hosamani, Poonam
    Collins, William
    Ahuja, Neera
    Langlotz, Curtis P.
    Hom, Jason
    Gatidis, Sergios
    Pauly, John
    Chaudhari, Akshay S.
    NATURE MEDICINE, 2024, 30 (04) : 1134 - 1142
  • [4] Adaptive Gating in Mixture-of-Experts based Language Models
    Li, Jiamin
    Su, Qiang
    Yang, Yitao
    Jiang, Yimin
    Wang, Cong
    Xu, Hong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 3577 - 3587
  • [5] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
    Lu, Xudong
    Liu, Qi
    Xu, Yuhui
    Zhou, Aojun
    Huang, Siyuan
    Zhang, Bo
    Yan, Junchi
    Li, Hongsheng
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 6159 - 6172
  • [6] Fine tuning the large language pegasus model for dialogue summarization
    Vinay Sarthak
    Preeti Rishiwal
    Mano Yadav
    Sushil Yadav
    Ashutosh Gangwar
    Shankdhar
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY, 2025, 17 (02) : 1165 - 1177
  • [7] Effectiveness of French Language Models on Abstractive Dialogue Summarization Task
    Zhou, Yongxin
    Portet, Francois
    Ringeval, Fabien
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3571 - 3581
  • [8] Benchmarking Large Language Models for News Summarization
    Zhang, Tianyi
    Ladhak, Faisal
    Durmus, Esin
    Liang, Percy
    Mckeown, Kathleen
    Hashimoto, Tatsunori B.
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 39 - 57
  • [9] Overcoming language barriers via machine translation with sparse Mixture-of-Experts fusion of large language models
    Zhu, Shaolin
    Jian, Dong
    Xiong, Deyi
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (03)
  • [10] Efficient Inference Offloading for Mixture-of-Experts Large Language Models in Internet of Medical Things
    Yuan, Xiaoming
    Kong, Weixuan
    Luo, Zhenyu
    Xu, Minrui
    ELECTRONICS, 2024, 13 (11)