Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights

Cited by: 0
Authors
Yu, Zezhu [1 ]
Liu, Suqing [2 ]
Denny, Paul [3 ]
Bergen, Andreas [1 ]
Liut, Michael [1 ]
Affiliations
[1] Univ Toronto Mississauga, Mississauga, ON, Canada
[2] McMaster Univ, Hamilton, ON, Canada
[3] Univ Auckland, Auckland, New Zealand
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Small Language Models; Large Language Models; Retrieval-Augmented Generation; Milvus; Intelligence Concentration; Conversational Agent; Personalized AI Agent; Intelligent Tutoring System; Computing Education; Computer Science Education;
DOI
Not available
CLC Classification
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Leveraging a Large Language Model (LLM) for personalized learning in computing education is promising, yet cloud-based LLMs pose risks around data security and privacy. To address these concerns, we developed and deployed a locally stored Small Language Model (SLM) utilizing Retrieval-Augmented Generation (RAG) methods to support computing students' learning. Previous work has demonstrated that SLMs can match or surpass popular LLMs (gpt-3.5-turbo and gpt-4-32k) in handling conversational data from a CS1 course. We deployed SLMs with RAG (SLM + RAG) in a large course with more than 250 active students, fielding nearly 2,000 student questions, while evaluating data privacy, scalability, and the feasibility of local deployments. This paper provides a comprehensive guide for deploying SLM + RAG systems, detailing model selection, vector database choice, embedding methods, and pipeline frameworks. We share practical insights from our deployment, including scalability concerns, accuracy versus context length trade-offs, guardrails and hallucination reduction, as well as data privacy maintenance. We address the "Impossible Triangle" in RAG systems, which states that achieving high accuracy, short context length, and low time consumption simultaneously is not feasible. Furthermore, our novel RAG framework, Intelligence Concentration (IC), categorizes information into multiple layers of abstraction within Milvus collections, mitigating these trade-offs and enabling educational assistants to deliver more relevant and personalized responses to students quickly.
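The pipeline the abstract describes lends itself to a short illustration. The sketch below shows one plausible shape of an SLM + RAG query path in the spirit of the paper: Intelligence-Concentration-style layering is approximated as one Milvus collection per abstraction level, a single query embedding is searched against each layer, and the retrieved passages ground a locally hosted small model. The collection names, the all-MiniLM-L6-v2 embedder, the Ollama-style endpoint, and the llama3.2:3b model tag are illustrative assumptions, not the authors' configuration.

# Minimal sketch of an SLM + RAG query path with Intelligence-Concentration-
# style layering: one Milvus collection per abstraction level. Collection
# names, the embedding model, the local endpoint, and the model tag are
# illustrative assumptions, not the authors' exact configuration.
import requests
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

# Assumed layer names, ordered from most general to most specific.
IC_LAYERS = ["course_summaries", "lecture_notes", "qa_transcripts"]

client = MilvusClient("slm_rag.db")                 # Milvus Lite: local file, no server
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any local embedding model works

def retrieve(question: str, per_layer: int = 2) -> list[str]:
    """Embed the question once, then pull a few hits from each IC layer."""
    vec = embedder.encode(question).tolist()
    passages: list[str] = []
    for layer in IC_LAYERS:
        hits = client.search(
            collection_name=layer,
            data=[vec],
            limit=per_layer,
            output_fields=["text"],
        )
        passages.extend(hit["entity"]["text"] for hit in hits[0])
    return passages

def answer(question: str) -> str:
    """Ground a locally hosted SLM on the retrieved passages."""
    context = "\n\n".join(retrieve(question))
    prompt = (
        "Answer using ONLY the course context below. "
        "If the context is insufficient, say so.\n\n"   # simple guardrail
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    # Assumes an Ollama-style local endpoint serving a small model.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2:3b", "prompt": prompt, "stream": False},
        timeout=60,
    )
    return resp.json()["response"]

if __name__ == "__main__":
    print(answer("How does short-circuit evaluation work in Python?"))

In a deployment of this shape, the per-layer fan-out and the order in which layers are consulted are the natural knobs for trading accuracy against context length and latency, the tension the abstract's "Impossible Triangle" refers to.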
Pages: 1302-1308
Page count: 7
Related Papers
31 items in total
  • [1] Integrating Small Language Models with Retrieval-Augmented Generation in Computing Education: Key Takeaways, Setup, and Practical Insights
    Yu, Zezhu
    Liu, Suqing
    Denny, Paul
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1, 2025, : 1302 - 1308
  • [2] Integrating Retrieval-Augmented Generation with Large Language Models in Nephrology: Advancing Practical Applications
    Miao, Jing
    Thongprayoon, Charat
    Suppadungsuk, Supawadee
    Valencia, Oscar A. Garcia
    Cheungpasitporn, Wisit
    MEDICINA-LITHUANIA, 2024, 60 (03)
  • [3] Integrating Graph Retrieval-Augmented Generation With Large Language Models for Supplier Discovery
    Li, Yunqing
    Ko, Hyunwoong
    Ameri, Farhad
    JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING, 2025, 25 (02)
  • [4] Benchmarking Large Language Models in Retrieval-Augmented Generation
    Chen, Jiawei
    Lin, Hongyu
    Han, Xianpei
    Sun, Le
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17754 - 17762
  • [5] Optimized interaction with Large Language Models: A practical guide to Prompt Engineering and Retrieval-Augmented Generation
    Fink, Anna
    Rau, Alexander
    Kotter, Elmar
    Bamberg, Fabian
    Russe, Maximilian Frederik
    RADIOLOGIE, 2025,
  • [6] Retrieval-augmented generation versus document-grounded generation: a key distinction in large language models
    Hewitt, Katherine J.
    Wiest, Isabella C.
    Kather, Jakob N.
    JOURNAL OF PATHOLOGY CLINICAL RESEARCH, 2025, 11 (01)
  • [7] Quantitative Evaluation of Using Large Language Models and Retrieval-Augmented Generation in Computer Science Education
    Wang, Kevin Shukang
    Lawrence, Ramon
    PROCEEDINGS OF THE 56TH ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, SIGCSE TS 2025, VOL 1, 2025, : 1183 - 1189
  • [9] Can Small Language Models With Retrieval-Augmented Generation Replace Large Language Models When Learning Computer Science?
    Liu, Suqing
    Yu, Zezhu
    Huang, Feiran
    Bulbulia, Yousef
    Bergen, Andreas
    Liut, Michael
    PROCEEDINGS OF THE 2024 CONFERENCE INNOVATION AND TECHNOLOGY IN COMPUTER SCIENCE EDUCATION, VOL 1, ITICSE 2024, 2024, : 388 - 393
  • [10] Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy
    Shao, Zhihong
    Gong, Yeyun
    Shen, Yelong
    Huang, Minlie
    Duan, Nan
    Chen, Weizhu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 9248 - 9274