Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights

Cited by: 0
Authors
Yu, Zezhu [1 ]
Liu, Suqing [2 ]
Denny, Paul [3 ]
Bergen, Andreas [1 ]
Liut, Michael [1 ]
Affiliations
[1] Univ Toronto Mississauga, Mississauga, ON, Canada
[2] McMaster Univ, Hamilton, ON, Canada
[3] Univ Auckland, Auckland, New Zealand
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Small Language Models; Large Language Models; Retrieval-Augmented Generation; Milvus; Intelligence Concentration; Conversational Agent; Personalized AI Agent; Intelligent Tutoring System; Computing Education; Computer Science Education;
DOI
Not available
CLC Classification
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Leveraging a Large Language Model (LLM) for personalized learning in computing education is promising, yet cloud-based LLMs pose risks around data security and privacy. To address these concerns, we developed and deployed a locally stored Small Language Model (SLM) utilizing Retrieval-Augmented Generation (RAG) methods to support computing students' learning. Previous work has demonstrated that SLMs can match or surpass popular LLMs (gpt-3.5-turbo and gpt-4-32k) in handling conversational data from a CS1 course. We deployed SLMs with RAG (SLM + RAG) in a large course with more than 250 active students, fielding nearly 2,000 student questions, while evaluating data privacy, scalability, and the feasibility of local deployments. This paper provides a comprehensive guide for deploying SLM + RAG systems, detailing model selection, vector database choice, embedding methods, and pipeline frameworks. We share practical insights from our deployment, including scalability concerns, accuracy versus context length trade-offs, guardrails and hallucination reduction, as well as data privacy maintenance. We address the "Impossible Triangle" in RAG systems, which states that achieving high accuracy, short context length, and low time consumption simultaneously is not feasible. Furthermore, our novel RAG framework, Intelligence Concentration (IC), categorizes information into multiple layers of abstraction within Milvus collections, mitigating these trade-offs and enabling educational assistants to deliver more relevant and personalized responses to students quickly.
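The pipeline the abstract describes lends itself to a short illustration. The sketch below shows one plausible shape of an SLM + RAG query path in the spirit of the paper: Intelligence-Concentration-style layering is approximated as one Milvus collection per abstraction level, a single query embedding is searched against each layer, and the retrieved passages ground a locally hosted small model. The collection names, the all-MiniLM-L6-v2 embedder, the Ollama-style endpoint, and the llama3.2:3b model tag are illustrative assumptions, not the authors' configuration.

# Minimal sketch of an SLM + RAG query path with Intelligence-Concentration-
# style layering: one Milvus collection per abstraction level. Collection
# names, the embedding model, the local endpoint, and the model tag are
# illustrative assumptions, not the authors' exact configuration.
import requests
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

# Assumed layer names, ordered from most general to most specific.
IC_LAYERS = ["course_summaries", "lecture_notes", "qa_transcripts"]

client = MilvusClient("slm_rag.db")                 # Milvus Lite: local file, no server
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any local embedding model works

def retrieve(question: str, per_layer: int = 2) -> list[str]:
    """Embed the question once, then pull a few hits from each IC layer."""
    vec = embedder.encode(question).tolist()
    passages: list[str] = []
    for layer in IC_LAYERS:
        hits = client.search(
            collection_name=layer,
            data=[vec],
            limit=per_layer,
            output_fields=["text"],
        )
        passages.extend(hit["entity"]["text"] for hit in hits[0])
    return passages

def answer(question: str) -> str:
    """Ground a locally hosted SLM on the retrieved passages."""
    context = "\n\n".join(retrieve(question))
    prompt = (
        "Answer using ONLY the course context below. "
        "If the context is insufficient, say so.\n\n"   # simple guardrail
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    # Assumes an Ollama-style local endpoint serving a small model.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2:3b", "prompt": prompt, "stream": False},
        timeout=60,
    )
    return resp.json()["response"]

if __name__ == "__main__":
    print(answer("How does short-circuit evaluation work in Python?"))

In a deployment of this shape, the per-layer fan-out and the order in which layers are consulted are the natural knobs for trading accuracy against context length and latency, the tension the abstract's "Impossible Triangle" refers to.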
Pages: 1302-1308
Page count: 7
Related Papers
31 items in total
  • [1] Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights
    Yu, Zezhu
    Liu, Suqing
    Denny, Paul
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1, 2025, : 1302 - 1308
  • [2] Integrating Retrieval-Augmented Generation with Large Language Models in Nephrology: Advancing Practical Applications
    Miao, Jing
    Thongprayoon, Charat
    Suppadungsuk, Supawadee
    Valencia, Oscar A. Garcia
    Cheungpasitporn, Wisit
    MEDICINA-LITHUANIA, 2024, 60 (03)
  • [3] Integrating Graph Retrieval-Augmented Generation With Large Language Models for Supplier Discovery
    Li, Yunqing
    Ko, Hyunwoong
    Ameri, Farhad
    JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING, 2025, 25 (02)
  • [4] Benchmarking Large Language Models in Retrieval-Augmented Generation
    Chen, Jiawei
    Lin, Hongyu
    Han, Xianpei
    Sun, Le
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17754 - 17762
  • [5] Optimized interaction with Large Language Models: A practical guide to Prompt Engineering and Retrieval-Augmented Generation
    Fink, Anna
    Rau, Alexander
    Kotter, Elmar
    Bamberg, Fabian
    Russe, Maximilian Frederik
    RADIOLOGIE, 2025,
  • [6] Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models
    Hewitt, Katherine J.
    Wiest, Isabella C.
    Kather, Jakob N.
    JOURNAL OF PATHOLOGY CLINICAL RESEARCH, 2025, 11 (01)
  • [7] Quantitative Evaluation of Using Large Language Models and Retrieval-Augmented Generation in Computer Science Education
    Wang, Kevin Shukang
    Lawrence, Ramon
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1, 2025, : 1183 - 1189
  • [9] Can Small Language Models With Retrieval-Augmented Generation Replace Large Language Models When Learning Computer Science?
    Liu, Suqing
    Yu, Zezhu
    Huang, Feiran
    Bulbulia, Yousef
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024, 2024, : 388 - 393
  • [10] Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy
    Shao, Zhihong
    Gong, Yeyun
    Shen, Yelong
    Huang, Minlie
    Duan, Nan
    Chen, Weizhu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 9248 - 9274