The Impact of Data Quantity and Source on the Quality of Data-Driven Hints for Programming

Cited by: 5
Authors
Price, Thomas W. [1]
Zhi, Rui [1]
Dong, Yihuan [1]
Lytle, Nicholas [1]
Barnes, Tiffany [1]
Affiliations
[1] North Carolina State Univ, Raleigh, NC 27606 USA
Funding
U.S. National Science Foundation
Keywords
Data-driven hints; Programming; Hint quality; Cold start; GENERATION;
DOI
10.1007/978-3-319-93843-1_35
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the domain of programming, intelligent tutoring systems increasingly employ data-driven methods to automate hint generation. Evaluations of these systems have largely focused on whether they can reliably provide hints for most students, and how much data is needed to do so, rather than how useful the resulting hints are to students. We present a method for evaluating the quality of data-driven hints and how their quality is impacted by the data used to generate them. Using two datasets, we investigate how the quantity of data and the source of data (whether it comes from students or experts) impact one hint generation algorithm. We find that with student training data, hint quality stops improving after 15-20 training solutions and can decrease with additional data. We also find that student data outperforms a single expert solution but that a comprehensive set of expert solutions generally performs best.
Pages: 476-490
Number of pages: 15