A Study Case of Automatic Archival Research and Compilation using Large Language Models

Authors
Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning
DOI
10.1109/ICKG59574.2023.00012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task that involves exploring, selecting, and processing large volumes of archival documents on specific subjects. Traditionally, the task has been labor-intensive and time-consuming. Recent advances in artificial intelligence have made automating it feasible. However, relevant samples are scarce, which significantly constrains deep learning models, given their heavy demand for data and knowledge. In this paper, we present a case study and propose a method for automatic archival research and compilation that leverages the knowledge base and text generation ability of large language models. The method comprises three components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we use fine-tuned large language models for simulation data generation and summary generation to improve performance. Experimental results support the effectiveness of the method. The method also offers a general approach to using large language models and a possible solution to similar challenges in other domains.
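The three components named in the abstract (document retrieval, document summarization, rule-based compilation) can be sketched as a small pipeline. This is only an illustrative sketch under loud assumptions, not the authors' implementation: the abstract gives no implementation details, so `Document`, `retrieve`, `summarize`, and `compile_report` are hypothetical names, retrieval is a toy keyword-overlap ranking, and the summarizer is a first-sentence placeholder standing in for the fine-tuned large language model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    """Hypothetical archival record; the fields are assumptions for illustration."""
    doc_id: str
    created: date
    text: str


def retrieve(docs, topic, top_k=3):
    """Toy retrieval: rank documents by how many topic words they contain."""
    terms = set(topic.lower().split())
    scored = []
    for doc in docs:
        words = set(doc.text.lower().replace(".", " ").split())
        score = len(terms & words)
        if score > 0:  # drop documents that share no word with the topic
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def summarize(doc):
    """Placeholder for the LLM summarizer: keeps only the first sentence."""
    first = doc.text.split(".")[0].strip()
    return first + "."


def compile_report(topic, docs):
    """Rule-based compilation: chronological order, one fixed-format line per document."""
    lines = [f"Topic: {topic}"]
    for doc in sorted(docs, key=lambda d: d.created):
        lines.append(f"{doc.created.isoformat()} [{doc.doc_id}] {summarize(doc)}")
    return "\n".join(lines)


# Made-up sample archive, used only to exercise the pipeline.
archive = [
    Document("A-002", date(1985, 6, 1),
             "The bridge construction finished on schedule. A ceremony followed."),
    Document("A-001", date(1983, 3, 15),
             "The city council approved the bridge construction plan. Funding came later."),
    Document("A-003", date(1990, 1, 10),
             "The harvest report lists grain yields. Nothing else was recorded."),
]

report = compile_report("bridge construction", retrieve(archive, "bridge construction"))
print(report)
```

In a real system, `summarize` would call a fine-tuned model and `retrieve` would use a proper search index; the compilation rules here (sort by date, fixed line template) are likewise just one plausible choice.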
Pages: 52-59
Page count: 8