How Many Data Does Machine Learning in Human-Computer Interaction Need?: Re-Estimating the Dataset Size for Convolutional Neural Network-Based Models of Visual Perception

Cited by: 1
Authors
Bakaev, Maxim
Heil, Sebastian [1]
Khvorostov, Vladimir
Gaedke, Martin [2]
Affiliations
[1] Tech Univ Chemnitz, Distributed & Selforganizing Syst, D-09107 Chemnitz, Germany
[2] Tech Univ Chemnitz, Dept Comp Sci, D-09107 Chemnitz, Germany
Funding
Russian Foundation for Basic Research
Keywords
Training; Training data; Estimation; Machine learning; Mean square error methods; Data models; Planning;
DOI
10.1109/MITP.2023.3262923
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Artificial intelligence (AI)-based user-interface (UI) design and evaluation are currently constrained by the scarcity of human-generated training data. Correspondingly, choosing an appropriate neural network (NN) architecture and carefully planning the sample size are essential for building accurate machine learning models. Previously, we estimated that for a convolutional NN to achieve a lower mean square error (MSE) than feature-based models, the required training dataset size should be approximately 3000. Our current validation study with roughly 4000 web UIs and 233 subjects suggests that the estimate should be closer to 17,000. We propose corrected regression models suggesting that the dataset size effect is better described by a logarithmic function. We also report significant differences in MSEs between the employed perception dimensions, with Aesthetics models having an MSE 21.5% worse than Complexity and 12.1% worse than Orderliness.
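The logarithmic dataset-size effect mentioned in the abstract can be illustrated with a minimal sketch. It assumes a model of the form MSE = a + b ln(N), which is one plausible reading of "logarithmic function" and not necessarily the exact regression used in the paper. The function names `fit_log_model` and `required_size`, the target MSE, and all data values below are hypothetical and synthetic; none are taken from the study.

```python
import numpy as np


def fit_log_model(sizes, mses):
    """Least-squares fit of mse = a + b * ln(size); returns (a, b)."""
    x = np.log(np.asarray(sizes, dtype=float))
    y = np.asarray(mses, dtype=float)
    # np.polyfit returns coefficients from the highest degree down: [slope, intercept]
    b, a = np.polyfit(x, y, 1)
    return float(a), float(b)


def required_size(a, b, target_mse):
    """Invert mse = a + b * ln(N) to find the N at which the fit predicts target_mse."""
    return float(np.exp((target_mse - a) / b))


# Synthetic, illustrative data only (NOT from the paper): an exact logarithmic curve.
sizes = [250, 500, 1000, 2000, 4000]
mses = [1.0 - 0.08 * np.log(n) for n in sizes]

a, b = fit_log_model(sizes, mses)

# Hypothetical benchmark: the MSE of a competing feature-based model.
n_needed = required_size(a, b, target_mse=0.25)
print(f"a={a:.3f}, b={b:.3f}, extrapolated N={n_needed:.0f}")
```

Under this kind of model, each doubling of the dataset reduces the predicted MSE by a constant amount (b ln 2), so extrapolating to a benchmark MSE can yield a required size several times larger than the data already collected.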
Pages: 23-29
Page count: 7