Multilingual Argument Mining: Datasets and Analysis

Authors
Toledo-Ronen, Orith [1]
Orbach, Matan [1]
Bilu, Yonatan [1]
Spector, Artem [1]
Slonim, Noam [1]
Affiliations
[1] IBM Res, Cambridge, MA 02142 USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020 | 2020
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The growing interest in argument mining and computational argumentation brings with it a plethora of Natural Language Understanding (NLU) tasks and corresponding datasets. However, as with many other NLU tasks, the dominant language is English, with resources in other languages being few and far between. In this work, we explore the potential of transfer learning using the multilingual BERT model to address argument mining tasks in non-English languages, based on English datasets and the use of machine translation. We show that such methods are well suited for classifying the stance of arguments and detecting evidence, but less so for assessing the quality of arguments, presumably because quality is harder to preserve under translation. In addition, focusing on the translate-train approach, we show how the choice of languages for translation, and the relations among them, affect the accuracy of the resultant model. Finally, to facilitate evaluation of transfer learning on argument mining tasks, we provide a human-generated dataset with more than 10k arguments in multiple languages, as well as machine translation of the English datasets.
Pages: 15