Predicting First-Language and Second-Language Proficiency Using Eye Fixation Data and Demographic Information: Assumptions, Data Representations, and Methods

Authors
Shalileh, Soroosh [1,2]
Kairov, Matvey [2]
Baminiwatte, Ranga [3]
Parshina, Olga [4]
Dragoy, Olga [1,5]
Affiliations
[1] HSE Univ, Ctr Language & Brain, Moscow 101000, Russia
[2] HSE Univ, Lab Artificial Intelligence Cognit Sci, Moscow 101000, Russia
[3] Clemson Univ, Sch Comp, Clemson, SC 29634 USA
[4] Middlebury Coll, Psychol Dept, Middlebury, VT 05753 USA
[5] Russian Acad Sci, Inst Linguist, Moscow 125009, Russia
Source
IEEE Access | 2024 / Volume 12
Keywords
Data models; Accuracy; Convolutional neural networks; Solid modeling; Predictive models; Prediction algorithms; Linguistics; Gaze tracking; Artificial intelligence; Natural language processing; Multilingual; First-language; second-language proficiency; eye-tracking; applied artificial intelligence; CONVOLUTIONAL NEURAL-NETWORKS; LEARNERS; LEVEL
DOI
10.1109/ACCESS.2024.3468460
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Studying first-language (L1) acquisition, second-language (L2) acquisition, and bilingualism using eye movement data has become a popular topic in the psycholinguistic and educational research communities. The current research uses eye fixation data, along with demographic information, to investigate five research questions (RQs). RQ1: Is it possible to predict L1 from eye fixation data using artificial intelligence (AI) methods? RQ2: Is it possible to predict second-language proficiency (L2P) from eye fixation data using AI methods? RQ3: Which of the six L2P assessment batteries under consideration is the most effective in predicting L2P? RQ4: How informative is eye fixation data, alone or combined with demographic information, in predicting L1 and L2P? RQ5: How can eye fixation data be represented for training AI models to predict L1 and L2P? We used the MECO L2 data set and scrutinized the performance of three families of AI methods. With respect to each RQ, the results showed that 1) using only eye fixation data, it is possible to predict L1 with a ROC-AUC of 0.755; 2) using only eye fixation data, it is not possible to predict L2P accurately (an R² score of only 0.216 was obtained); 3) L2 Lexical Skills is the most effective L2P assessment battery; 4) combining the eye fixation data with demographic features significantly improved model performance, yielding a ROC-AUC of 0.997 in predicting L1 and an R² score of 0.899 in predicting L2P, while simultaneously reducing the influence of the eye fixation parameters; 5) 2D scatter plot images can be considered an appropriate candidate for training AI models using only eye fixation data, at least for predicting L1.
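For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how an R² score and a ROC-AUC can be computed from predictions. It is not the authors' evaluation code. The function names `r2_score` and `roc_auc` and the toy inputs are assumptions made for illustration, and the ROC-AUC is shown only for the binary case, whereas predicting L1 across several languages would require a multi-class averaging scheme that the abstract does not specify.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def roc_auc(labels, scores):
    """Binary ROC-AUC as the fraction of (positive, negative) pairs
    in which the positive example receives the higher score.
    Ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy values, not taken from the MECO L2 data set.
    print(r2_score([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under these definitions, an R² score of 0 corresponds to a model that does no better than always predicting the mean, and a ROC-AUC of 0.5 corresponds to random ranking, which gives some context for the reported values of 0.216 and 0.755.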
Pages: 145832-145844
Page count: 13