Comparison of various approaches to tagging for the inflectional Slovak language

Cited by: 0
Authors
Benko L. [1]
Munkova D. [1]
Pappová M. [1]
Munk M. [1, 2]
Affiliations
[1] Department of Computer Science, Constantine the Philosopher University in Nitra, Nitra
[2] Science and Research Centre, University of Pardubice, Pardubice
Keywords
Automatic taggers; Low-resource language; Morphological annotation; Part-of-speech tagging; Slovak language
DOI
10.7717/PEERJ-CS.2026
Abstract
Morphological tagging provides essential insight into the grammar, structure, and mutual relationships of words within a sentence. Tagging text in a highly inflectional language is challenging because of word ambiguity. This research compares six automatic taggers for the inflectional Slovak language, seeking the most accurate tagger for literary and non-literary texts. Our results indicate that it is useful to separate texts into literary and non-literary and then choose a tagger based on the text style. For literary texts, UDPipe2 outperformed the others in seven out of nine examined tagset positions. Conversely, for non-literary texts, RNNTagger achieved the highest performance in eight out of nine examined tagset positions. RNNTagger is recommended for both types of text because it best captures the inflection of the Slovak language, although UDPipe2 demonstrates higher accuracy for literary texts. Despite dataset size limitations, this study emphasizes the suitability of various taggers for inflectional languages like Slovak. © Copyright 2024 Benko et al.
Pages: 1 / 31
Number of pages: 30