Context Compression and Extraction: Efficiency Inference of Large Language Models

Times Cited: 0
Authors
Zhou, Junyao [1 ]
Du, Ruiqing [1 ]
Tan, Yushan [2 ]
Yang, Jintao [2 ]
Yang, Zonghao [2 ]
Luo, Wei [2 ]
Luo, Zhunchen [2 ]
Zhou, Xian [2 ]
Hu, Wenpeng [2 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056000, Peoples R China
[2] Acad Mil Sci Peoples Liberation Army, Beijing 100000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
self-information; mutual-information; context compression; large language model;
DOI
10.1007/978-981-97-5663-6_19
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models have shown great capability in handling long contexts. However, when applied to question-answering tasks, excessively long contexts inevitably contain redundant information, which can lead to the loss of significant details. It is therefore a challenge to retain the information related to the user's query intent in long contexts. To address this problem, our study proposes a novel Context Compression and Extraction (CCE) technique that takes the impact of the user query into account. CCE computes the mutual information between the query and its context and integrates this with self-information to preserve query-relevant information in the compressed context. We have validated our approach across diverse datasets that require integrated context-processing capabilities, such as an arXiv paper dataset and a news article dataset. Our method is effective in various tasks, including summarization, question answering, and the reconstruction of original contexts. Experimental results show that our method outperforms a strong baseline across several evaluation metrics, significantly enhancing the quality of text generated in downstream tasks.
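The abstract's core idea, scoring context by self-information combined with query-context mutual information and keeping only the highest-scoring spans, can be illustrated with a minimal sketch. This is not the paper's implementation: the unigram surprisal estimate, the query-overlap proxy for mutual information, the equal 0.5/0.5 weighting, and the function names are all illustrative assumptions.

```python
import math
from collections import Counter

def tokenize(text):
    """Naive whitespace tokenizer (assumption; the paper likely uses an LM tokenizer)."""
    return text.lower().split()

def compress_context(context_sentences, query, keep_ratio=0.5):
    """Keep the top fraction of sentences by combined self-information
    and query-relevance score, preserving their original order."""
    # Unigram probabilities estimated from the context itself (toy stand-in
    # for an LM's token probabilities).
    corpus = [tok for s in context_sentences for tok in tokenize(s)]
    counts = Counter(corpus)
    total = len(corpus)

    def surprisal(tok):
        # Self-information of a token: -log2 p(tok)
        return -math.log2(counts[tok] / total)

    query_toks = set(tokenize(query))
    scored = []
    for s in context_sentences:
        toks = tokenize(s)
        # Mean self-information of the sentence's tokens.
        si = sum(surprisal(t) for t in toks) / max(len(toks), 1)
        # Query-token overlap as a crude proxy for query-context mutual information.
        mi = len(query_toks & set(toks)) / max(len(query_toks), 1)
        scored.append((0.5 * si + 0.5 * mi, s))

    k = max(1, round(keep_ratio * len(context_sentences)))
    keep = {s for _, s in sorted(scored, reverse=True)[:k]}
    return [s for s in context_sentences if s in keep]
```

In this sketch, a sentence survives compression either because its tokens are informative (rare, hence high surprisal) or because it overlaps the query; a real CCE-style system would replace both scores with probabilities from the language model itself.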
Pages: 221-232
Page Count: 12
Related Papers
50 records in total
  • [1] Compressing Context to Enhance Inference Efficiency of Large Language Models
    Li, Yucheng
    Dong, Bo
    Guerin, Frank
    Lin, Chenghua
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 6342 - 6353
  • [2] Measuring and Improving the Energy Efficiency of Large Language Models Inference
    Argerich, Mauricio Fadel
    Patino-Martinez, Marta
    IEEE ACCESS, 2024, 12 : 80194 - 80207
  • [3] Language Models for Lexical Inference in Context
    Schmitt, Martin
    Schuetze, Hinrich
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1267 - 1280
  • [4] Extending Context Window of Large Language Models via Semantic Compression
    Fei, Weizhi
    Niu, Xueyan
    Zhou, Pingyi
    Hou, Lu
    Bai, Bo
    Deng, Lei
    Han, Wei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 5169 - 5181
  • [5] High Efficiency Image Compression for Large Visual-Language Models
    Li, Binzhe
    Wang, Shurun
    Wang, Shiqi
    Ye, Yan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) : 2870 - 2880
  • [6] Inference to the Best Explanation in Large Language Models
    Dalal, Dhairya
    Valentino, Marco
    Freitas, Andre
    Buitelaar, Paul
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 217 - 235
  • [7] Assessing Inference Time in Large Language Models
    Walkowiak, Bartosz
    Walkowiak, Tomasz
    SYSTEM DEPENDABILITY-THEORY AND APPLICATIONS, DEPCOS-RELCOMEX 2024, 2024, 1026 : 296 - 305
  • [8] A Survey on Model Compression for Large Language Models
    Zhu, Xunyu
    Li, Jian
    Liu, Yong
    Ma, Can
    Wang, Weiping
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 1556 - 1577
  • [9] Sources of Hallucination by Large Language Models on Inference Tasks
    McKenna, Nick
    Li, Tianyi
    Cheng, Liang
    Hosseini, Mohammad Javad
    Johnson, Mark
    Steedman, Mark
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 2758 - 2774
  • [10] GPT-RE: In-context Learning for Relation Extraction using Large Language Models
    Wan, Zhen
    Cheng, Fei
    Mao, Zhuoyuan
    Liu, Qianying
    Song, Haiyue
    Li, Jiwei
    Kurohashi, Sadao
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 3534 - 3547