Domain-aware evaluation of named entity recognition systems for Croatian

Authors
[1] Agić, Željko
[2] Bekavac, Božo
Source
Agić, Z. (zagic@ffzg.hr) | Year 1600 / Vol. University of Zagreb Faculty of Electrical Engineering and Computing / Issue 21
Keywords
Natural language processing systems;
DOI
10.2498/cit.1002190
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We evaluate the currently available named entity recognition systems for Croatian, with special emphasis on domain dependence. To this end, we manually annotated a dataset of approximately 1 million tokens of Croatian text from various domains within the newspaper text genre. The dataset was annotated using a three-class named entity tagset denoting personal names, locations and organizations. We give insight into feature selection, domain sensitivity and the effects of increasing the training set size for statistical named entity recognition using the state-of-the-art Stanford NER system. We also sketch a comparison of publicly available named entity recognition systems for Croatian with respect to domain dependence, regardless of their underlying paradigms. Our top-performing system achieved an F1-score of 0.884 in a mixed-domain testing scenario, scoring 0.925 and 0.843 in the two domains separated for the experiment. The system consistently achieves state-of-the-art scores for detecting names of persons, locations and organizations.
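The abstract reports F1-scores for a mixed-domain test set and for each domain separately. As a minimal sketch of how such a domain-aware score could be computed, the Python below assumes exact-match, entity-level scoring over (domain, doc_id, start, end, label) tuples. The tuple layout, the domain names "news" and "sport", and the toy data are illustrative assumptions; they are not taken from the paper, and this is not the authors' evaluation script.

```python
from collections import defaultdict


def prf1(gold, pred):
    """Exact-match precision, recall and F1 over two sets of entity spans."""
    tp = len(gold & pred)  # entities whose span and label both match
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def evaluate_by_domain(gold, pred):
    """Score each domain separately, plus a pooled 'mixed' score.

    gold, pred: iterables of (domain, doc_id, start, end, label) tuples
    (a hypothetical layout chosen for this sketch).
    """
    gold_by, pred_by = defaultdict(set), defaultdict(set)
    for entity in gold:
        gold_by[entity[0]].add(entity[1:])
    for entity in pred:
        pred_by[entity[0]].add(entity[1:])

    scores = {}
    for domain in sorted(set(gold_by) | set(pred_by)):
        scores[domain] = prf1(gold_by[domain], pred_by[domain])
    # The mixed score pools all entities; it is not the mean of domain scores.
    scores["mixed"] = prf1(set(gold), set(pred))
    return scores


# Toy data with the three-class tagset (PER, LOC, ORG); values are made up.
gold = [
    ("news", "d1", 0, 2, "PER"),
    ("news", "d1", 5, 6, "LOC"),
    ("sport", "d2", 0, 1, "ORG"),
    ("sport", "d2", 3, 4, "PER"),
]
pred = [
    ("news", "d1", 0, 2, "PER"),
    ("news", "d1", 5, 6, "LOC"),
    ("sport", "d2", 0, 1, "ORG"),
    ("sport", "d2", 3, 4, "LOC"),  # right span, wrong label: counts as an error
]

scores = evaluate_by_domain(gold, pred)
for name, (p, r, f) in scores.items():
    print(f"{name}: P={p:.3f} R={r:.3f} F1={f:.3f}")
```

Pooling entities for the mixed score weights each domain by its number of entities, so a larger domain influences the mixed figure more than a smaller one.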