A comparison of sequential and combined approaches for named entity recognition in a corpus of handwritten medieval charters

Cited by: 15
Authors
Boros, Emanuela [1]
Romero, Veronica [2]
Maarand, Martin [1]
Zenklova, Katerina [3]
Kreckova, Jitka [3]
Vidal, Enrique [2]
Stutzmann, Dominique [4]
Kermorvant, Christopher [1]
Affiliations
[1] TEKLIA, Paris, France
[2] Univ Politecn Valencia, PRHLT Res Ctr, Valencia, Spain
[3] Narodni Arch, Prague, Czech Republic
[4] IRHT CNRS, Paris, France
Keywords
Named entity recognition; handwritten text recognition; historical document processing; multilingualism
DOI
10.1109/ICFHR2020.2020.00025
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a new corpus of multilingual medieval handwritten charter images, annotated with full transcriptions and named entities. The corpus is used to compare two approaches to named entity recognition in historical document images in several languages: a sequential approach, the more commonly used, which applies handwritten text recognition (HTR) and then named entity recognition (NER); and a combined approach, which transcribes the text-line image and extracts the entities simultaneously. Experiments on the charter corpus in Latin, Early New High German, and Old Czech for name, date, and location recognition show that the combined approach performs better.
Pages: 79-84
Number of pages: 6