A Study Case of Automatic Archival Research and Compilation using Large Language Models

Cited by: 0
Authors
Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning
DOI
10.1109/ICKG59574.2023.00012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task that focuses on exploration, selection and processing of vast quantities of archival documents pertaining to specific subjects. Traditionally, this task has been characterized by its labor-intensive and time-consuming requirements. In recent years, the advancement of artificial intelligence has made automatic archival research and compilation tasks feasible. However, the limited availability of relevant samples imposes significant constraints on the application of deep learning models, given their high demand for sufficient data and knowledge. In this paper, we present a study case and propose an innovative method for automatic archival research and compilation, leveraging the robust knowledge base and text generation ability offered by large language models. Specifically, our method comprises three essential components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we leverage fine-tuned large language models to enhance the performance by simulation data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method provides a general idea in using large language models, as well as a solution for addressing similar challenges in different domains.
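The three components named in the abstract (document retrieval, document summarization, rule-based compilation) can be pictured as a simple pipeline. The following is a minimal sketch only, not the authors' implementation: the `Document` fields, the keyword-matching retrieval, the first-sentence summarizer (a stand-in for the fine-tuned large language model), and the date-ordered compilation rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical record layout; the abstract does not specify one.
    title: str
    date: str  # ISO date string, e.g. "1953-02-20"
    text: str


def retrieve(docs, topic):
    """Stage 1 (stand-in): keep documents whose text mentions every topic keyword."""
    keywords = topic.lower().split()
    return [d for d in docs if all(k in d.text.lower() for k in keywords)]


def summarize(doc):
    """Stage 2 (stand-in): the abstract uses a fine-tuned LLM here.
    This placeholder simply returns the first sentence of the text."""
    first = doc.text.split(".")[0].strip()
    return first + "."


def compile_entries(docs, topic):
    """Stage 3 (assumed rule): sort by date and emit one line per document."""
    lines = [f"Compilation on: {topic}"]
    for d in sorted(docs, key=lambda d: d.date):
        lines.append(f"{d.date} | {d.title} | {summarize(d)}")
    return "\n".join(lines)


def run_pipeline(docs, topic):
    """Chain the three stages: retrieve, then summarize and compile."""
    return compile_entries(retrieve(docs, topic), topic)


# Toy corpus, invented for illustration only.
corpus = [
    Document("Flood report", "1954-07-10",
             "The river flood damaged the granary. Repairs followed."),
    Document("Tax ledger", "1951-01-05",
             "Annual tax receipts were recorded."),
    Document("Relief order", "1953-02-20",
             "A flood relief order was issued for the river towns. Grain was distributed."),
]

result = run_pipeline(corpus, "river flood")
print(result)
```

In a real system each stand-in would be replaced by the corresponding component; the point of the sketch is only that the stages compose, so the summarizer can be swapped for a model call without touching retrieval or compilation.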
Pages: 52-59
Page count: 8