A Study Case of Automatic Archival Research and Compilation using Large Language Models

Authors
Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning
DOI
10.1109/ICKG59574.2023.00012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task that focuses on exploration, selection and processing of vast quantities of archival documents pertaining to specific subjects. Traditionally, this task has been characterized by its labor-intensive and time-consuming requirements. In recent years, the advancement of artificial intelligence has made automatic archival research and compilation tasks feasible. However, the limited availability of relevant samples imposes significant constraints on the application of deep learning models, given their high demand for sufficient data and knowledge. In this paper, we present a study case and propose an innovative method for automatic archival research and compilation, leveraging the robust knowledge base and text generation ability offered by large language models. Specifically, our method comprises three essential components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we leverage fine-tuned large language models to enhance the performance by simulation data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method provides a general idea in using large language models, as well as a solution for addressing similar challenges in different domains.
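The three components named in the abstract (document retrieval, document summarization, rule-based compilation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Document` type, the function names, the term-overlap ranking, and the chronological compilation rule are all assumptions made for illustration, and the summarizer is a plain truncation placeholder standing in for the fine-tuned large language model the abstract describes.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical archival record; the fields are illustrative assumptions."""
    doc_id: str
    date: str  # ISO date string, so lexical order equals chronological order
    text: str


def retrieve(docs, topic_terms, top_k=3):
    """Rank documents by the number of distinct topic terms they contain.

    A simple stand-in for the retrieval component; the paper's actual
    retrieval method is not specified in the abstract.
    """
    terms = {t.lower() for t in topic_terms}
    scored = []
    for doc in docs:
        words = set(re.findall(r"\w+", doc.text.lower()))
        score = len(terms & words)
        if score > 0:
            scored.append((score, doc))
    # Highest score first; ties broken by doc_id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1].doc_id))
    return [doc for _, doc in scored[:top_k]]


def summarize(doc, max_words=12):
    """Placeholder summarizer: keep the first max_words words.

    In the described method this step is performed by a fine-tuned
    large language model, which is not reproduced here.
    """
    words = doc.text.split()
    summary = " ".join(words[:max_words])
    return summary + ("..." if len(words) > max_words else "")


def compile_entries(docs, summarizer=summarize):
    """Assumed compilation rule: one line per document, in date order."""
    ordered = sorted(docs, key=lambda d: d.date)
    return [f"{d.date} [{d.doc_id}] {summarizer(d)}" for d in ordered]


if __name__ == "__main__":
    archive = [
        Document("a2", "1953-01-01", "Flood relief report for the river district"),
        Document("a1", "1952-05-01", "Flood damage survey"),
        Document("a3", "1950-01-01", "Annual school budget"),
    ]
    hits = retrieve(archive, ["flood", "relief"])
    for line in compile_entries(hits):
        print(line)
```

Passing a different callable as `summarizer` is where a model-backed summarizer would plug in; the retrieval and compilation steps stay unchanged.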
Pages: 52-59
Page count: 8