ArithmeticGPT: empowering small-size large language models with advanced arithmetic skills

Cited by: 0
Authors
Liu, Zitao
Zheng, Ying
Yin, Zhibo
Chen, Jiahao
Liu, Tianqiao
Tian, Mi
Luo, Weiqi
Institutions
Funding
National Key R&D Program of China;
Keywords
Large language models; Problem-solving; Math reasoning; Curriculum learning;
DOI
10.1007/s10994-024-06681-1
CLC Number
TP18 [Artificial intelligence theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Large language models (LLMs) have shown remarkable capabilities in understanding and generating language across a wide range of domains. However, advanced arithmetic calculation remains a significant challenge, especially for small-size LLMs. In this paper, we therefore propose ArithmeticGPT, a practical framework designed to enhance the advanced arithmetic skills of small-size LLMs. We carefully curate an arithmetic instruction dataset, ArithInstruct, which teaches small-size LLMs to trigger a self-developed internal calculation API for precise computation without explicit instructions, so that exact arithmetic results are generated seamlessly within natural-language sentences. Furthermore, we empirically design a practical three-stage strategy for fine-tuning small-size LLMs on ArithInstruct, enabling advanced arithmetic skills while preserving the models' original abilities such as commonsense reasoning and question answering. We evaluate ArithmeticGPT on six public math-related datasets against 17 state-of-the-art LLM baselines, and the experimental results demonstrate the superiority of our approach. To encourage reproducible research, we make our data and code publicly available at https://github.com/ai4ed/ArithmeticGPT.
Pages: 23
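
The abstract describes a mechanism in which the fine-tuned model triggers an internal calculation API during generation and the exact result is spliced back into the sentence. The sketch below is a minimal illustration of that general pattern only: the <calc>...</calc> tag format, the function names, and the use of Python's eval as a stand-in calculator are assumptions made for this example, not the paper's actual interface (the authors' real implementation is in the linked repository).

```python
import re

# Hypothetical illustration: the trigger-tag format below is an assumption,
# not the interface used by ArithmeticGPT.
CALC_PATTERN = re.compile(r"<calc>(.*?)</calc>")

def evaluate_expression(expr: str) -> str:
    """Stand-in 'calculation API': compute an arithmetic expression exactly.
    A production system would use a safe arithmetic parser, not eval()."""
    return str(eval(expr, {"__builtins__": {}}, {}))

def postprocess(generated_text: str) -> str:
    """Replace model-emitted <calc> spans with computed results so the
    numbers appear seamlessly inside the natural-language output."""
    return CALC_PATTERN.sub(lambda m: evaluate_expression(m.group(1)), generated_text)

if __name__ == "__main__":
    draft = "The total cost is <calc>13.75 * 24</calc> dollars."
    print(postprocess(draft))  # -> The total cost is 330.0 dollars.
```

In a real pipeline the calculator would more likely be invoked as soon as the model emits the closing tag, with the result fed back into the decoding context; the post-hoc substitution above is only the simplest way to show the idea.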