The Causal News Corpus: Annotating Causal Relations in Event Sentences from News

Cited by: 0
Authors
Tan, Fiona Anting [1]
Hurriyetoglu, Ali [2]
Caselli, Tommaso [3]
Oostdijk, Nelleke [4]
Nomoto, Tadashi [5]
Hettiarachchi, Hansi [6]
Ameer, Iqra [7]
Uca, Onur [8]
Liza, Farhana Ferdousi [9]
Hu, Tiancheng [10]
Affiliations
[1] Natl Univ Singapore, Inst Data Sci, Singapore, Singapore
[2] Koc Univ, Istanbul, Turkey
[3] Univ Groningen, Groningen, Netherlands
[4] Radboud Univ Nijmegen, Nijmegen, Netherlands
[5] Natl Inst Japanese Literature, Tokyo, Japan
[6] Birmingham City Univ, Birmingham, W Midlands, England
[7] Inst Politecn Nacl, Ctr Invest Comp, Mexico City, DF, Mexico
[8] Mersin Univ, Dept Sociol, Mersin, Turkey
[9] Univ East Anglia, Norwich, Norfolk, England
[10] Swiss Fed Inst Technol, Zurich, Switzerland
Funding
National Research Foundation, Singapore;
Keywords
causality; event causality; text mining; natural language understanding;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
Despite the importance of understanding causality, corpora addressing causal relations are limited. There is a discrepancy between existing annotation guidelines for event causality and conventional causality corpora that focus more on linguistics. Many guidelines restrict themselves to explicit relations or clause-based arguments. We therefore propose an annotation schema for event causality that addresses these concerns. We annotated 3,559 event sentences from protest event news with labels indicating whether each sentence contains causal relations. Our corpus is known as the Causal News Corpus (CNC). A neural network built upon a state-of-the-art pre-trained language model performed well, with an F1 score of 81.20% on the test set and 83.46% in 5-fold cross-validation. CNC is transferable across two external corpora: CausalTimeBank (CTB) and Penn Discourse Treebank (PDTB). Leveraging each of these external datasets for training, we achieved up to approximately 64% F1 on the CNC test set without additional fine-tuning. CNC also served as an effective training and pre-training dataset for the two external corpora. Lastly, we demonstrate the difficulty of our task to the layman in a crowd-sourced annotation exercise. Our annotated corpus is publicly available, providing a valuable resource for causal text mining researchers.
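The evaluation terms used in the abstract (F1 score and 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `binary_f1` and `k_fold_indices` are hypothetical, the abstract does not state which F1 variant was reported (binary F1 on the causal class is assumed here), and the round-robin fold assignment is an assumption rather than the split the authors used.

```python
from typing import List, Sequence, Tuple


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (1 = causal, 0 = non-causal; assumed encoding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs using round-robin folds."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        splits.append((train_idx, test_idx))
    return splits


# Hypothetical usage: 3,559 is the corpus size stated in the abstract.
splits = k_fold_indices(3559, k=5)
```

In a cross-validation run, a model would be trained on each `train_indices` set and scored with `binary_f1` on the matching `test_indices` set, and the five scores would then be averaged.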
Pages: 2298 - 2310
Page count: 13
Related papers
50 in total
  • [21] What if User Preferences Shifts: Causal Disentanglement for News Recommendation
    Miao, Yingzhi
    Chen, Zhiqiang
    Zhou, Fang
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2024, PT 2, 2025, 14851 : 496 - 506
  • [22] Extracting temporal and causal relations based on event networks
    Duc-Thuan Vo
    Al-Obeidat, Feras
    Bagheri, Ebrahim
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (06)
  • [23] Fake News Detection by Means of Uncertainty Weighted Causal Graphs
    Garrido-Merchan, Eduardo C.
    Puente, Cristina
    Palacios, Rafael
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2020, 2020, 12344 : 13 - 24
  • [24] CAUSAL SHIFTS IN NEWS REPORTING - ENGLISH VERSUS GREEK PRESS
    SIDIROPOULOU, M
    PERSPECTIVES-STUDIES IN TRANSLATOLOGY, 1995, (01) : 83 - 98
  • [25] The Circumstantial Event Ontology (CEO) and ECB+/CEO: an Ontology and Corpus for Implicit Causal Relations between Events
    Segers, Roxane
    Caselli, Tommaso
    Vossen, Piek
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 4585 - 4592
  • [26] From Citing Sentences to Causal Networks: The Causality Index
    Small, Henry
    18TH INTERNATIONAL CONFERENCE ON SCIENTOMETRICS & INFORMETRICS (ISSI2021), 2021, : 1039 - 1044
  • [27] CrudeOilNews: An Annotated Crude Oil News Corpus for Event Extraction
    Lee, Meisin
    Soon, Lay-Ki
    Siew, Eu-Gene
    Sugianto, Ly Fie
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 465 - 479
  • [28] A Causal View for Multi-Interest User Modeling in News Recommendation
    Yu, Mei
    Zhou, Xiaoxi
    Zhao, Mankun
    Xu, Tianyi
    Zhao, Yue
    Yu, Ruiguo
    Li, Xuewei
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 433 - 441
  • [29] Anonymity in sharing morally salient news: the causal role of the temporoparietal junction
    Cui, Fang
    Zhong, Yijia
    Feng, Chenghu
    Peng, Xiaozhe
    CEREBRAL CORTEX, 2023, 33 (09) : 5457 - 5468
  • [30] Conformity-Aware Debiased Neural News Recommendation with Causal Reasoning
    Bao, Ji-Min
    Zhang, Kun
    Wu, Lei
    Hong, Ri-Chang
    Wang, Meng
    Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (10) : 2333 - 2351