A Dataset of Offensive German Language Tweets Annotated for Speech Acts

Cited by: 0

Authors:

Plakidis, Melina [1,2]

Rehm, Georg [1,2]

Affiliations:

[1] DFKI GmbH, Alt Moabit 91C, D-10559 Berlin, Germany

[2] Humboldt Univ, Dorotheenstr 24, D-10117 Berlin, Germany

Source:

LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2022

Keywords:

Speech acts; hate speech detection; offensive language; annotation; corpus annotation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Subject classification codes:

081203; 0835

Abstract:

We present a dataset consisting of German offensive and non-offensive tweets, annotated for speech acts. These 600 tweets are a subset of the dataset by Struß et al. (2019) and comprise three levels of annotation, i.e., six coarse-grained speech acts, 23 fine-grained speech acts and 14 different sentence types. Furthermore, we provide an evaluation in both qualitative and quantitative terms. The dataset is made publicly available under a CC-BY-4.0 license.

Pages: 4799-4807

Page count: 9

50 items in total

[1] A Dataset for Investigating the Impact of Context for Offensive Language Detection in Tweets
Ihtiyar, Musa Nuri
Ozdemir, Omer
Erengul, Mustafa Emre
Ozgur, Arzucan
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 1543 - 1549
[2] CoRoSeOf - An Annotated Corpus of Romanian Sexist and Offensive Tweets
Hoefels, Diana Constantina
Coltekin, Cagri
Madroane, Irina Diana
LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022: 2269 - 2281
[3] Annotated dataset of history-related tweets
Sumikawa, Yasunobu
Jatowt, Adam
DATA IN BRIEF, 2021, 38
[4] SOLD: Sinhala offensive language dataset
Ranasinghe, Tharindu
Anuradha, Isuri
Premasiri, Damith
Silva, Kanishka
Hettiarachchi, Hansi
Uyangodage, Lasitha
Zampieri, Marcos
LANGUAGE RESOURCES AND EVALUATION, 2025, 59 (01) : 297 - 337
[5] CLASSIFICATION OF SPEECH ACTS WITH FUTURE SEMANTICS (IN THE GERMAN LANGUAGE)
Bodnaruk, E. V.
VESTNIK SANKT-PETERBURGSKOGO UNIVERSITETA-YAZYK I LITERATURA, 2015, (02) : 62 - 75
[6] A Dataset of Offensive Language in Kosovo Social Media
Ajvazi, Adem
Hardmeier, Christian
LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022: 1860 - 1869
[7] HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Offensive Language and Hate Speech Detection
Vargas, Francielle
Carvalho, Isabelle
Goes, Fabiana
Pardo, Thiago A. S.
Benevenuto, Fabricio
LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022: 7174 - 7183
[8] LANGUAGE, SPEECH, AND SPEECH-ACTS
MACKINNO.E
PHILOSOPHY AND PHENOMENOLOGICAL RESEARCH, 1973, 34 (02) : 224 - 238
[9] LASTD: A Manually Annotated and Tested Large Arabic Sentiment Tweets Dataset
Elshakankery, Kariman
Fayek, Magda
Farouk, Mona
5TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEM AND DATA MINING (ICISDM 2021), 2021: 62 - 66
[10] From Insult to Hate Speech: Mapping Offensive Language in German User Comments on Immigration
Paasch-Colberg, Sunje
Strippel, Christian
Trebbe, Joachim
Emmer, Martin
MEDIA AND COMMUNICATION, 2021, 9 (01) : 171 - 180