PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities

Cited by: 0
Authors
Sravanthi, Settaluri Lakshmi [1]
Doshi, Meet [1]
Kalyan, Tankala Pavan [1]
Murthy, Rudra [2]
Dabre, Raj [3]
Bhattacharyya, Pushpak [1]
Affiliations
[1] Indian Inst Technol, CFILT, Mumbai, Maharashtra, India
[2] IBM Res, Armonk, NY USA
[3] NICT, Tokyo, Japan
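Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;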
Abstract
LLMs have demonstrated remarkable capability for understanding semantics, but their understanding of pragmatics is not well studied. To this end, we release a Pragmatics Understanding Benchmark (PUB) dataset consisting of fourteen tasks in four pragmatics phenomena, namely, Implicature, Presupposition, Reference, and Deixis. We curate high-quality test sets for each task, consisting of Multiple Choice Question Answers (MCQA). PUB includes a total of 28k data points, of which 6.1k are newly annotated. We evaluate nine models varying in the number of parameters and type of training. Our study reveals several key observations about the pragmatic capabilities of LLMs: 1. chat-fine-tuning strongly benefits smaller models, 2. large base models are competitive with their chat-fine-tuned counterparts, 3. there is a huge variance in performance across different pragmatics phenomena, and 4. there is a noticeable performance gap between human capabilities and model capabilities. We hope that PUB will enable comprehensive evaluation of LLMs' pragmatic reasoning capabilities.
Pages: 12075-12097
Page count: 23