MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types

Cited by: 0
Authors
Murugesan, Keerthiram [1]
Swaminathan, Sarathkrishna [1]
Dan, Soham [1]
Chaudhury, Subhajit [1]
Gunasekara, Chulaka [1]
Crouse, Maxwell [1]
Mahajan, Diwakar [1]
Abdelaziz, Ibrahim [1]
Fokoue, Achille [1]
Kapanipathi, Pavan [1]
Roukos, Salim [1]
Gray, Alexander [1]
Affiliations
[1] IBM Res, New York, NY 10598 USA
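Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code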
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the growing interest in large language models, the need to evaluate the quality of machine-generated text against reference (typically human-generated) text has become a focus of attention. Most recent works either focus on task-specific evaluation metrics or study the properties of machine-generated text captured by existing metrics. In this work, we propose a new evaluation scheme to model human judgments in 7 NLP tasks, based on the fine-grained mismatches between a pair of texts. Inspired by recent efforts toward fine-grained evaluation in several NLP tasks, we introduce a set of 13 mismatch error types, such as spatial/geographic errors and entity errors, to guide the model toward better prediction of human judgments. We propose a neural framework for evaluating machine texts that uses these mismatch error types as auxiliary tasks and re-purposes existing single-number evaluation metrics as additional scalar features, alongside textual features extracted from the machine and reference texts. Our experiments reveal key insights about the existing metrics via the mismatch errors. We show that the mismatch errors between the sentence pairs on the held-out datasets from 7 NLP tasks align well with the human evaluation.
Pages: 4485 - 4503
Number of pages: 19
Related papers
50 in total
  • [21] On the Zero-Shot Generalization of Machine-Generated Text Detectors
    Pu, Xiao
    Zhang, Jingyu
    Han, Xiaochuang
    Tsvetkov, Yulia
    He, Tianxing
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 4799 - 4808
  • [22] RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text
    Dugan, Liam
    Ippolito, Daphne
    Kirubarajan, Arun
    Callison-Burch, Chris
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING: SYSTEM DEMONSTRATIONS, 2020, : 189 - 196
  • [23] Fine-grained concrete with various types of fibers
    Begich, Y. E.
    Klyuev, S. V.
    Jos, V. A.
    Cherkashin, A. V.
    MAGAZINE OF CIVIL ENGINEERING, 2020, 97 (05):
  • [24] Fine-Grained Geolocalization of User-Generated Short Text Based on Weight Probability Model
    Gao, Congjie
    Li, Yongjun
    Yang, Jiaqi
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 2089 - 2092
  • [25] Fine-Grained Geolocalization of User-Generated Short Text Based on a Weight Probability Model
    Gao, Congjie
    Li, Yongjun
    Yang, Jiaqi
    Dong, Wei
    IEEE ACCESS, 2019, 7 : 153579 - 153591
  • [26] Fine-grained and coarse-grained contrastive learning for text classification
    Zhang, Shaokang
    Ran, Ning
    NEUROCOMPUTING, 2024, 596
  • [27] Saturation evaluation for fine-grained sediments
    Zhu, Linqi
    Wu, Shiguo
    Zhou, Xueqing
    Cai, Jianchao
    GEOSCIENCE FRONTIERS, 2023, 14 (04)
  • [28] Fine-Grained Machine Teaching with Attention Modeling
    Liu, Jiacheng
    Hou, Xiaofeng
    Tang, Feilong
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 2585 - 2592
  • [29] Fine-Grained Evaluation for Entity Linking
    Rosales-Mendez, Henry
    Hogan, Aidan
    Poblete, Barbara
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 718 - 727
  • [30] Enhancing Machine-Generated Text Detection: Adversarial Fine-Tuning of Pre-Trained Language Models
    Hee Lee, Dong
    Jang, Beakcheol
    IEEE ACCESS, 2024, 12 : 65333 - 65340