An Empirical, Quantitative Analysis of the Differences Between Sarcasm and Irony

Cited by: 20
Authors
Ling, Jennifer [1 ]
Klinger, Roman [1 ]
Affiliations
[1] Univ Stuttgart, Inst Maschinelle Sprachverarbeitung, Pfaffenwaldring 5b, D-70569 Stuttgart, Germany
Source
SEMANTIC WEB, ESWC 2016 | 2016 / Vol. 9989
Keywords
PRETENSE THEORY;
DOI
10.1007/978-3-319-47602-5_39
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A variety of classification approaches for the detection of ironic or sarcastic messages has been proposed in the last decade to improve sentiment classification. However, despite the availability of psychologically and linguistically motivated theories regarding the difference between irony and sarcasm, these typically do not carry over to a use in predictive models; one reason might be that these concepts are often considered very similar. In this paper, we contribute an empirical analysis of Tweets and how authors label them as irony or sarcasm. We use this distantly labeled corpus to estimate a model to distinguish between both classes of figurative language with the aim to, ultimately, improve the semantically correct interpretation of opinionated statements. Our model separates irony from sarcasm with 79% accuracy on a balanced set. This result suggests that the task is harder than separating irony or sarcasm from regular texts with 89% and 90% accuracy, respectively. A feature analysis shows that ironic Tweets have on average a lower number of sentences than sarcastic Tweets. Sarcastic Tweets contain more positive words than ironic Tweets. Sarcastic Tweets are more often messages to a specific recipient than ironic Tweets. The analysis of bag-of-words features suggests that the comparably high classification performance to distinguish irony from sarcasm is supported by specific, reoccurring topics.
Pages: 203-216
Page count: 14