Context Compression and Extraction: Efficiency Inference of Large Language Models

Times Cited: 0
Authors
Zhou, Junyao [1 ]
Du, Ruiqing [1 ]
Tan, Yushan [2 ]
Yang, Jintao [2 ]
Yang, Zonghao [2 ]
Luo, Wei [2 ]
Luo, Zhunchen [2 ]
Zhou, Xian [2 ]
Hu, Wenpeng [2 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056000, Peoples R China
[2] Acad Mil Sci Peoples Liberat Army, Beijing 100000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
self-information; mutual-information; context compression; large language model;
DOI
10.1007/978-981-97-5663-6_19
CLC Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models have shown great capability in dealing with long contexts. However, when applied to question-answering tasks, excessively long contexts unavoidably contain redundant information, which can lead to a loss of significant details. It is therefore challenging to retain the information related to the user's query intent in long contexts. To address this problem, our study proposes a novel Context Compression and Extraction (CCE) technique, which takes the impact of the user query into account. CCE computes the mutual information between the query and its context, integrating this with self-information to preserve query-relevant information in the compressed context. We have validated our approach across diverse datasets that require integrated context processing capabilities, such as an arXiv paper dataset and a news article dataset. Our methodology exhibits efficacy in various tasks, including summarization, question answering, and the reconstruction of original contexts. Experimental results validate the superior performance of our method compared to a strong baseline across several evaluation metrics, significantly enhancing the quality of text generated in downstream tasks.
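The abstract describes scoring context by combining self-information with query-context mutual information, then keeping only the highest-scoring spans. The sketch below illustrates that general idea only, not the paper's actual method: it uses unigram surprisal as a stand-in for LM-based self-information and lexical overlap with the query as a stand-in for mutual information, and all function names and the `alpha` weighting are illustrative assumptions.

```python
import math
from collections import Counter


def self_information(tokens, unigram_counts, total_tokens, vocab_size):
    # Surprisal -log p(t) summed over tokens, with add-one smoothing.
    # A real system would use a language model's token log-probabilities.
    return sum(
        -math.log((unigram_counts[t] + 1) / (total_tokens + vocab_size))
        for t in tokens
    )


def query_relevance(tokens, query_tokens):
    # Stand-in for query-context mutual information: fraction of the
    # sentence's unique tokens that also appear in the query.
    overlap = set(tokens) & set(query_tokens)
    return len(overlap) / max(len(set(tokens)), 1)


def compress(context_sentences, query, keep_ratio=0.5, alpha=0.5):
    """Keep the top keep_ratio fraction of sentences by combined score,
    preserving their original order."""
    tokenized = [s.lower().split() for s in context_sentences]
    counts = Counter(t for sent in tokenized for t in sent)
    total = sum(counts.values())
    q_tokens = query.lower().split()
    scores = [
        alpha * query_relevance(sent, q_tokens)
        + (1 - alpha) * self_information(sent, counts, total, len(counts))
        / max(len(sent), 1)  # length-normalise the surprisal term
        for sent in tokenized
    ]
    k = max(1, int(len(context_sentences) * keep_ratio))
    keep = sorted(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])
    return [context_sentences[i] for i in keep]
```

With a toy context and the query "quantum particle physics", the query-relevant sentence survives compression while generic filler is dropped; in the real method the two stand-in scores would be replaced by model-derived self-information and mutual information.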
Pages: 221 - 232
Page count: 12
Related Papers
50 records in total
  • [31] Adaptive In-Context Learning with Large Language Models for Bundle
    Sun, Zhu
    Feng, Kaidong
    Yang, Jie
    Qu, Xinghua
    Fang, Hui
    Ong, Yew-Soon
    Liu, Wenyuan
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 966 - 976
  • [32] A Comparative Study of Large Language Models for Goal Model Extraction
    Siddeshwar, Vaishali
    Alwidian, Sanaa
    Makrehchi, Masoud
    ACM/IEEE 27TH INTERNATIONAL CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS: COMPANION PROCEEDINGS, MODELS 2024, 2024, : 253 - 263
  • [33] Empirical Analysis of Dialogue Relation Extraction with Large Language Models
    Li, Guozheng
    Xu, Zijie
    Shang, Ziyu
    Liu, Jiajun
    Ji, Ke
    Guo, Yikai
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 6359 - 6367
  • [34] LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models
    Feng, Xiaoning
    Han, Xiaohong
    Chen, Simin
    Yang, Wei
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (07)
  • [35] GPTQT: Quantize Large Language Models Twice to Push the Efficiency
    Guo, Yipin
    Lang, Yilin
    Ren, Qinyuan
    2024 IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, CIS AND IEEE INTERNATIONAL CONFERENCE ON ROBOTICS, AUTOMATION AND MECHATRONICS, RAM, CIS-RAM 2024, 2024, : 368 - 373
  • [36] Implications of Large Language Models for Quality and Efficiency of Neurologic Care
    Moura, Lidia
    Jones, David T.
    Sheikh, Irfan S.
    Murphy, Shawn
    Kalfin, Michael
    Kummer, Benjamin R.
    Weathers, Allison L.
    Grinspan, Zachary M.
    Silsbee, Heather M.
    Jones Jr, Lyell K.
    Patel, Anup D.
    NEUROLOGY, 2024, 102 (11) : e209497
  • [37] Understanding the Effect of Model Compression on Social Bias in Large Language Models
    Goncalves, Gustavo
    Strubell, Emma
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 2663 - 2675
  • [38] Employing Large Language Models (LLMs) for study candidate selection and clinical data extraction in the context of retinal research.
    Hernandez, Luis
    Quiroz-Mercado, Hugo
    Fromow-Guerra, Jans
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2024, 65 (07)
  • [39] Layer-Condensed KV Cache for Efficient Inference of Large Language Models
    Wu, Haoyi
    Tu, Kewei
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 11175 - 11188
  • [40] EchoSwift: An Inference Benchmarking and Configuration Discovery Tool for Large Language Models (LLMs)
    Krishna, Karthik
    Bandili, Ramana
    COMPANION OF THE 15TH ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING, ICPE COMPANION 2024, 2024, : 158 - 162