A Dataset of Offensive German Language Tweets Annotated for Speech Acts

Cited by: 0

Authors:

Plakidis, Melina [1,2]

Rehm, Georg [1,2]

Affiliations:

[1] DFKI GmbH, Alt Moabit 91C, D-10559 Berlin, Germany

[2] Humboldt Univ, Dorotheenstr 24, D-10117 Berlin, Germany

Source:

LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2022

Keywords:

Speech acts; hate speech detection; offensive language; annotation; corpus annotation;

DOI:

Not available

Chinese Library Classification (CLC):

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

We present a dataset consisting of German offensive and non-offensive tweets, annotated for speech acts. These 600 tweets are a subset of the dataset by Struß et al. (2019) and comprise three levels of annotation, i.e., six coarse-grained speech acts, 23 fine-grained speech acts and 14 different sentence types. Furthermore, we provide an evaluation in both qualitative and quantitative terms. The dataset is made publicly available under a CC-BY-4.0 license.


Pages: 4799-4807

Page count: 9

