PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities

Authors
Sravanthi, Settaluri Lakshmi [1]
Doshi, Meet [1]
Kalyan, Tankala Pavan [1]
Murthy, Rudra [2]
Dabre, Raj [3]
Bhattacharyya, Pushpak [1]
Affiliations
[1] Indian Inst Technol, CFILT, Mumbai, Maharashtra, India
[2] IBM Res, Armonk, NY USA
[3] NICT, Tokyo, Japan
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
LLMs have demonstrated remarkable capability for understanding semantics, but their understanding of pragmatics is not well studied. To this end, we release a Pragmatics Understanding Benchmark (PUB) dataset consisting of fourteen tasks in four pragmatics phenomena, namely, Implicature, Presupposition, Reference, and Deixis. We curate high-quality test sets for each task, consisting of Multiple Choice Question Answers (MCQA). PUB includes a total of 28k data points, of which 6.1k are newly annotated. We evaluate nine models varying in the number of parameters and type of training. Our study reveals several key observations about the pragmatic capabilities of LLMs: 1. chat-fine-tuning strongly benefits smaller models, 2. large base models are competitive with their chat-fine-tuned counterparts, 3. there is a huge variance in performance across different pragmatics phenomena, and 4. there is a noticeable performance gap between human capabilities and model capabilities. We hope that PUB will enable comprehensive evaluation of LLMs' pragmatic reasoning capabilities.
Pages: 12075-12097
Page count: 23