A Study Case of Automatic Archival Research and Compilation using Large Language Models

Cited by: 0
Authors
Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning
DOI
10.1109/ICKG59574.2023.00012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task that involves exploring, selecting, and processing large quantities of archival documents on specific subjects. Traditionally, this task has been labor-intensive and time-consuming. In recent years, advances in artificial intelligence have made automatic archival research and compilation feasible. However, relevant samples are scarce, which significantly constrains the application of deep learning models, since these models require large amounts of data and knowledge. In this paper, we present a case study and propose a method for automatic archival research and compilation that leverages the broad knowledge base and text generation ability of large language models. Specifically, our method comprises three components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we use fine-tuned large language models for both simulated data generation and summary generation to improve performance. Experimental results support the effectiveness of our method. Furthermore, our method offers a general approach to using large language models, as well as a solution to similar challenges in other domains.
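The three components named in the abstract (document retrieval, document summarization, rule-based compilation) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Document` type, the function names, the keyword-match retrieval, the chronological compilation rule, and the output template are all assumptions made for illustration. In particular, `summarize` is only a placeholder that keeps the first sentence; in the paper this step is performed by a fine-tuned large language model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    # Hypothetical record type; the paper does not specify a schema.
    title: str
    created: date
    text: str


def retrieve(docs, topic_terms):
    """Stage 1 (assumed rule): keep documents mentioning any topic term."""
    terms = [t.lower() for t in topic_terms]
    return [d for d in docs if any(t in d.text.lower() for t in terms)]


def summarize(doc, max_sentences=1):
    """Stage 2 placeholder: the leading sentence(s) stand in for an LLM summary."""
    sentences = [s.strip() for s in doc.text.split(".") if s.strip()]
    return ". ".join(sentences[:max_sentences]) + "."


def compile_entries(docs, summarizer=summarize):
    """Stage 3 (assumed rules): chronological order, one templated line per document."""
    ordered = sorted(docs, key=lambda d: d.created)
    return [f"{d.created.isoformat()} | {d.title}: {summarizer(d)}" for d in ordered]


def run_pipeline(docs, topic_terms, summarizer=summarize):
    """Chain the three stages: retrieve, summarize, compile."""
    return compile_entries(retrieve(docs, topic_terms), summarizer)


# Illustrative data only; not taken from the paper.
sample = [
    Document("Flood report", date(1998, 8, 3),
             "The river flooded the east district. Relief began later."),
    Document("Budget memo", date(1997, 1, 5), "Annual budget was approved."),
    Document("Levee plan", date(1996, 5, 1),
             "A levee was planned after an earlier flood. Cost was high."),
]

for line in run_pipeline(sample, ["flood"]):
    print(line)
```

Passing a different callable as `summarizer` is the point where a fine-tuned model could be plugged in; the retrieval and compilation stages would stay unchanged.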
Pages: 52-59
Number of pages: 8