MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types

Authors
Murugesan, Keerthiram [1]
Swaminathan, Sarathkrishna [1]
Dan, Soham [1]
Chaudhury, Subhajit [1]
Gunasekara, Chulaka [1]
Crouse, Maxwell [1]
Mahajan, Diwakar [1]
Abdelaziz, Ibrahim [1]
Fokoue, Achille [1]
Kapanipathi, Pavan [1]
Roukos, Salim [1]
Gray, Alexander [1]
Affiliations
[1] IBM Res, New York, NY 10598 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the growing interest in large language models, the need to evaluate the quality of machine-generated text against reference (typically human-generated) text has become a focus of attention. Most recent works either focus on task-specific evaluation metrics or study the properties of machine-generated text captured by the existing metrics. In this work, we propose a new evaluation scheme to model human judgments in 7 NLP tasks, based on the fine-grained mismatches between a pair of texts. Inspired by recent efforts toward fine-grained evaluation in several NLP tasks, we introduce a set of 13 mismatch error types, such as spatial/geographic errors and entity errors, to guide the model toward better prediction of human judgments. We propose a neural framework for evaluating machine texts that uses these mismatch error types as auxiliary tasks and re-purposes the existing single-number evaluation metrics as additional scalar features, in addition to textual features extracted from the machine and reference texts. Our experiments reveal key insights about the existing metrics via the mismatch errors. We show that the mismatch errors between the sentence pairs on the held-out datasets from 7 NLP tasks align well with the human evaluation.
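The training setup described in the abstract (a main human-judgment prediction plus 13 mismatch error types as auxiliary tasks, with existing metric scores appended as scalar features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear heads, the squared-error and binary cross-entropy losses, the `aux_weight` mixing factor, and all function and parameter names are hypothetical. Only the count of 13 error types comes from the abstract.

```python
import math

NUM_ERROR_TYPES = 13  # number of mismatch error types stated in the abstract


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def build_features(text_features, metric_scores):
    """Concatenate textual features with scalar metric scores (hypothetical layout)."""
    return list(text_features) + list(metric_scores)


def linear(weights, bias, x):
    """Plain dot product plus bias; stands in for a neural prediction head."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def multitask_loss(features, params, human_score, error_labels, aux_weight=0.5):
    """Main-task squared error plus weighted mean BCE over the error-type heads.

    The loss forms and aux_weight are illustrative assumptions, not the paper's.
    """
    # Main task: predict the human judgment score.
    pred = linear(params["score_w"], params["score_b"], features)
    main = (pred - human_score) ** 2

    # Auxiliary tasks: one binary prediction per mismatch error type.
    aux = 0.0
    for w, b, y in zip(params["error_w"], params["error_b"], error_labels):
        p = sigmoid(linear(w, b, features))
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to keep log() finite
        aux -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    aux /= len(error_labels)

    return main + aux_weight * aux


def zero_params(dim):
    """All-zero parameters, used only to make the example reproducible."""
    return {
        "score_w": [0.0] * dim,
        "score_b": 0.0,
        "error_w": [[0.0] * dim for _ in range(NUM_ERROR_TYPES)],
        "error_b": [0.0] * NUM_ERROR_TYPES,
    }
```

In this sketch, `build_features([0.1, 0.2, 0.3], [0.4, 0.5])` would combine three textual features with two scalar metric scores into one input vector shared by the main head and all 13 auxiliary heads.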
Pages: 4485-4503
Page count: 19