Collective Human Opinions in Semantic Textual Similarity

Cited by: 0
Authors
Wang, Yuxia [1]
Tao, Shimin [2]
Xie, Ning
Yang, Hao
Baldwin, Timothy [1, 3]
Verspoor, Karin [1, 4]
Affiliations
[1] Univ Melbourne, Melbourne, Vic, Australia
[2] Huawei TSC, Beijing, Peoples R China
[3] MBZUAI, Abu Dhabi, U Arab Emirates
[4] RMIT Univ, Melbourne, Vic, Australia
DOI
10.1162/tacl_a_00584
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as the gold standard. Averaging masks the true distribution of human opinions on examples of low agreement, and prevents models from capturing the semantic vagueness that the individual ratings represent. In this work, we introduce USTS, the first Uncertainty-aware STS dataset with ~15,000 Chinese sentence pairs and 150,000 labels, to study collective human opinions in STS. Analysis reveals that neither a scalar nor a single Gaussian fits a set of observed judgments adequately. We further show that current STS models cannot capture the variance caused by human disagreement on individual instances, but rather reflect the predictive confidence over the aggregate dataset.
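The point about averaging can be illustrated with a minimal sketch. The ratings below are hypothetical values on an assumed 0-5 scale, invented for illustration and not drawn from USTS; the helper name `summarize` is likewise an assumption. Two sentence pairs receive the same mean rating, yet the annotators agree on one and split into two camps on the other.

```python
from statistics import mean, pstdev

# Hypothetical ratings on an assumed 0-5 scale; NOT taken from USTS.
agree = [2, 3, 3, 2, 3, 2, 3, 2, 3, 2]  # annotators cluster near 2.5
split = [0, 5, 0, 5, 0, 5, 0, 5, 0, 5]  # annotators split into two camps


def summarize(ratings):
    """Return (mean, population standard deviation) of the ratings."""
    return mean(ratings), pstdev(ratings)


agree_mean, agree_sd = summarize(agree)
split_mean, split_sd = summarize(split)

# The averaged "gold" label is identical for both pairs ...
print(agree_mean, split_mean)
# ... while the spread of individual opinions differs substantially.
print(agree_sd, split_sd)
```

In the split case, no annotator actually chose a value near the mean, so a scalar label (or a single Gaussian centred on it) would misrepresent the observed judgments in this toy setting.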
Pages: 997-1013
Page count: 17