Predicting Difficulty and Discrimination of Natural Language Questions

Cited by: 0
Authors
Byrd, Matthew A. [1]
Srivastava, Shashank [1]
Affiliations
[1] Univ North Carolina Chapel Hill, Chapel Hill, NC 27599 USA
Keywords
ITEM RESPONSE THEORY; READABILITY
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Item Response Theory (IRT) has been extensively used to numerically characterize question difficulty and discrimination for human subjects in domains including cognitive psychology and education (Primi et al., 2014; Downing, 2003). More recently, IRT has been used to similarly characterize item difficulty and discrimination for natural language models across various datasets (Lalor et al., 2019; Vania et al., 2021; Rodriguez et al., 2021). In this work, we explore predictive models for directly estimating and explaining these traits for natural language questions in a question-answering context. We use HotpotQA for illustration. Our experiments show that it is possible to predict both difficulty and discrimination parameters for new questions, and these traits are correlated with features of questions, answers, and associated contexts. Our findings can have significant implications for the creation of new datasets and tests on the one hand and strategies such as active learning and curriculum learning on the other.
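To make the two traits named in the abstract concrete, the following is a minimal sketch of the standard two-parameter logistic (2PL) IRT item response function. The abstract does not say which IRT variant the authors fit, so the 2PL form, the function name `p_correct`, and the parameter names `theta`, `a`, and `b` are assumptions chosen for illustration only.

```python
import math


def p_correct(theta: float, a: float, b: float) -> float:
    """Probability that a subject answers an item correctly under 2PL IRT.

    theta: latent ability of the subject (human or model).
    a:     item discrimination (how sharply the item separates abilities).
    b:     item difficulty (ability at which the probability is 0.5).
    All three names are illustrative assumptions, not taken from the paper.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Compare a low- and a high-discrimination item of equal difficulty.
    for ability in (-1.0, 0.0, 1.0):
        low = p_correct(ability, a=0.5, b=0.0)
        high = p_correct(ability, a=2.0, b=0.0)
        print(f"theta={ability:+.1f}  a=0.5: {low:.3f}  a=2.0: {high:.3f}")
```

Under this form, a larger `b` shifts the curve toward higher abilities (a harder item), and a larger `a` makes the curve steeper around `b` (a more discriminating item).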
Pages: 119-130
Page count: 12
Related papers
50 results in total
  • [31] Natural Language Questions and Answers for RDF Information Resources
    Akita, Chie
    Mase, Motohiro
    Kitamura, Yasuhiko
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2010, 14 (04) : 384 - 389
  • [32] MEASURING THE DIFFICULTY OF REFERENCE QUESTIONS
    CHILDERS, T
    LOPATA, C
    STAFFORD, B
    RQ, 1991, 31 (02) : 237 - 243
  • [33] Predicting implicit attitudes with natural language data
    Bhatia, Sudeep
    Walasek, Lukasz
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2023, 120 (25)
  • [34] Predicting Item Difficulty in a Language Test with an Adaptive Neuro Fuzzy Inference System
    Aryadoust, Vahid
    PROCEEDINGS OF THE 2013 IEEE WORKSHOP ON HYBRID INTELLIGENT MODELS AND APPLICATIONS (HIMA), 2013 : 43 - 50
  • [35] Relations of the Number of Functioning Distractors With the Item Difficulty Index and the Item Discrimination Power in the Multiple Choice Questions
    Chauhan, Girish R.
    Chauhan, Bhoomika R.
    Vaza, Jayesh V.
    Chauhan, Pradip R.
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (07)
  • [36] PREDICTING DIFFICULTY OF LARYNGOSCOPY
    WILLIAMSON, R
    ANAESTHESIA AND INTENSIVE CARE, 1993, 21 (06) : 896 - 897
  • [37] DISCRIMINATION TEST WORD DIFFICULTY
    CAMPBELL, RA
    JOURNAL OF SPEECH AND HEARING RESEARCH, 1965, 8 (01) : 13 - 22
  • [38] Testing the limits of natural language models for predicting human language judgements
    Golan, Tal
    Siegelman, Matthew
    Kriegeskorte, Nikolaus
    Baldassano, Christopher
    NATURE MACHINE INTELLIGENCE, 2023, 5 (09) : 952 - +
  • [39] Testing the limits of natural language models for predicting human language judgements
    Tal Golan
    Matthew Siegelman
    Nikolaus Kriegeskorte
    Christopher Baldassano
    Nature Machine Intelligence, 2023, 5 : 952 - 964
  • [40] PREDICTING PARENTING DIFFICULTY
    PITTMAN, JF
    WRIGHT, CA
    LLOYD, SA
    JOURNAL OF FAMILY ISSUES, 1989, 10 (02) : 267 - 286