Collective Human Opinions in Semantic Textual Similarity

Cited by: 0

Authors:

Wang, Yuxia [1]

Tao, Shimin [2]

Xie, Ning

Yang, Hao

Baldwin, Timothy [1,3]

Verspoor, Karin [1,4]

Affiliations:

[1] Univ Melbourne, Melbourne, Vic, Australia

[2] Huawei TSC, Beijing, Peoples R China

[3] MBZUAI, Abu Dhabi, U Arab Emirates

[4] RMIT Univ, Melbourne, Vic, Australia

Source:

TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS | 2023 / Vol. 11

DOI:

10.1162/tacl_a_00584

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as the gold standard. Averaging masks the true distribution of human opinions on examples of low agreement, and prevents models from capturing the semantic vagueness that the individual ratings represent. In this work, we introduce USTS, the first Uncertainty-aware STS dataset with ~15,000 Chinese sentence pairs and 150,000 labels, to study collective human opinions in STS. Analysis reveals that neither a scalar nor a single Gaussian fits a set of observed judgments adequately. We further show that current STS models cannot capture the variance caused by human disagreement on individual instances, but rather reflect the predictive confidence over the aggregate dataset.

Pages: 997-1013

Number of pages: 17

