Self-attention in Knowledge Tracing: Why It Works

Cited by: 3
Authors
Pu, Shi [1 ]
Becker, Lee [1 ]
Affiliations
[1] Educ Testing Serv, 660 Rosedale Rd, Princeton, NJ 08540 USA
Source
ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I | 2022 / Vol. 13355
Keywords
Deep knowledge tracing; Self-attention; Knowledge tracing
DOI
10.1007/978-3-031-11644-5_75
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Knowledge tracing refers to the dynamic assessment of a learner's mastery of skills. The self-attention mechanism has been widely adopted in knowledge-tracing models in recent years, and these models consistently report performance gains over baseline knowledge-tracing models on public datasets. However, why the self-attention mechanism works in knowledge tracing is unknown. This study argues that the ability to encode when a learner attempts to answer the same item multiple times in a row (henceforth referred to as repeated attempts) is a significant reason why self-attention models perform better than other deep knowledge-tracing models. We present two experiments to support our argument, using context-aware knowledge tracing (AKT) as our example self-attention model and dynamic key-value memory networks (DKVMN) and deep performance factors analysis (DPFA) as our baseline models. First, we show that removing repeated attempts from the datasets closes the performance gap between AKT and the baseline models. Second, we present DPFA+, an extension of DPFA that consumes manually crafted repeated-attempts features, and demonstrate that DPFA+ outperforms AKT across all datasets.
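The following is a minimal Python sketch, not the paper's implementation, illustrating the two manipulations the abstract describes: removing consecutive repeated attempts from a learner's interaction sequence (as in the first experiment) and deriving a simple repeated-attempts count feature of the kind DPFA+ might consume (as in the second). The interaction format, function names, and the exact feature definition are assumptions.

# Hypothetical sketch of the two manipulations described in the abstract;
# field names and the feature definition are assumptions, not the paper's.

from typing import List, Tuple

Interaction = Tuple[int, int]  # (item_id, correct), with correct in {0, 1}

def drop_repeated_attempts(seq: List[Interaction]) -> List[Interaction]:
    """Keep only the first attempt in any run of consecutive attempts at the same item."""
    filtered: List[Interaction] = []
    for item, correct in seq:
        if filtered and filtered[-1][0] == item:
            continue  # skip an immediate re-attempt of the same item
        filtered.append((item, correct))
    return filtered

def repeated_attempt_counts(seq: List[Interaction]) -> List[int]:
    """For each interaction, count the immediately preceding attempts at the same item."""
    counts: List[int] = []
    run = 0
    prev_item = None
    for item, _ in seq:
        run = run + 1 if item == prev_item else 0
        counts.append(run)
        prev_item = item
    return counts

if __name__ == "__main__":
    seq = [(7, 0), (7, 0), (7, 1), (3, 1), (7, 0)]
    print(drop_repeated_attempts(seq))   # [(7, 0), (3, 1), (7, 0)]
    print(repeated_attempt_counts(seq))  # [0, 1, 2, 0, 0]

Under these assumptions, the first function mimics the dataset ablation that closes the AKT-versus-baseline gap, while the second yields one plausible hand-crafted repeated-attempts feature that a DPFA-style model could take as input.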
Pages: 731-736
Page count: 6
Related papers (50 records in total)
  • [21] On the Integration of Self-Attention and Convolution
    Pan, Xuran
    Ge, Chunjiang
    Lu, Rui
    Song, Shiji
    Chen, Guanfu
    Huang, Zeyi
    Huang, Gao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 805 - 815
  • [22] On The Computational Complexity of Self-Attention
    Keles, Feyza Duman
    Wijewardena, Pruthuvi Mahesakya
    Hegde, Chinmay
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 597 - 619
  • [23] The Lipschitz Constant of Self-Attention
    Kim, Hyunjik
    Papamakarios, George
    Mnih, Andriy
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [24] The function of the self-attention network
    Cunningham, Sheila J.
    COGNITIVE NEUROSCIENCE, 2016, 7 (1-4) : 21 - 22
  • [25] Convolutional Self-Attention Networks
    Yang, Baosong
    Wang, Longyue
    Wong, Derek F.
    Chao, Lidia S.
    Tu, Zhaopeng
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4040 - 4045
  • [26] Self-Attention Graph Pooling
    Lee, Junhyun
    Lee, Inyeop
    Kang, Jaewoo
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [27] FOCUS OF ATTENTION IN GROUPS - A SELF-ATTENTION PERSPECTIVE
    MULLEN, B
    CHAPMAN, JG
    PEAUGH, S
    JOURNAL OF SOCIAL PSYCHOLOGY, 1989, 129 (06): : 807 - 817
  • [28] Masked face recognition based on knowledge distillation and convolutional self-attention network
    Wan, Weiguo
    Wen, Runlin
    Yao, Li
    Yang, Yong
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, : 2269 - 2284
  • [29] SEMANTIC IMAGES SEGMENTATION FOR AUTONOMOUS DRIVING USING SELF-ATTENTION KNOWLEDGE DISTILLATION
    Karine, Ayoub
    Napoleon, Thibault
    Jridi, Maher
    2022 16TH INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY & INTERNET-BASED SYSTEMS, SITIS, 2022, : 198 - 202
  • [30] Knowledge-Aware Self-Attention Networks for Document Grounded Dialogue Generation
    Tang, Xiangru
    Hu, Po
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT II, 2019, 11776 : 400 - 411