Two-step sequence transformer based method for Cham to Latin script transliteration

Authors
Tien-Nam Nguyen [1]
Burie, Jean-Christophe [1]
Thi-Lan Le [2]
Schweyer, Anne-Valerie [3]
Affiliations
[1] Lab Informat Image Interact L3i, La Rochelle, France
[2] Sch Elect & Elect Engn SEEE, Hanoi, Vietnam
[3] CNRS, Ctr Asie Sud Est CASE, Paris, France
Keywords
Transliteration; Historical documents; Cham manuscript images; Transformer; Sequence to Sequence
DOI
10.1145/3604951.3605525
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fusing visual and textual information is a promising way to obtain richer feature representations. In this work, we propose a method for text-line transliteration of Cham manuscripts that combines the visual and textual modalities. Instead of the standard approach of directly recognizing the words in the image, we split the problem into two steps. First, we define a recognition scenario in which visually similar characters are treated as a single character class. Second, we use a transformer model that considers both visual and contextual information to refine the prediction wherever similar characters are involved, so that they can be distinguished. Following this two-step strategy, the proposed method consists of a sequence-to-sequence model and a multi-modal transformer, and thus takes advantage of both. Extensive experiments show that the proposed method outperforms existing approaches from the literature on our Cham manuscripts dataset.
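The first step described in the abstract (treating similar characters as one class) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the confusion groups use placeholder Latin letters rather than real Cham characters, and the names `CONFUSION_GROUPS`, `merge_similar`, `ambiguous_positions`, and `candidates` are hypothetical, not taken from the paper. The sketch only shows how labels could be collapsed and which positions would then be left for a second, context-aware model to resolve; it does not implement the sequence-to-sequence model or the multi-modal transformer.

```python
# Hypothetical confusion groups: placeholder symbols, NOT real Cham data.
# Each tuple lists characters assumed to be visually hard to tell apart.
CONFUSION_GROUPS = [("b", "p"), ("d", "t")]

# Map every member of a group to the group's first member (the merged class).
MERGE = {ch: group[0] for group in CONFUSION_GROUPS for ch in group}

# For each merged class, the original characters it may stand for.
EXPAND = {group[0]: group for group in CONFUSION_GROUPS}


def merge_similar(label: str) -> str:
    """Step 1 target: rewrite a label so similar characters share one class."""
    return "".join(MERGE.get(ch, ch) for ch in label)


def ambiguous_positions(coarse: str) -> list[int]:
    """Positions in a coarse prediction that step 2 would have to resolve."""
    return [i for i, ch in enumerate(coarse) if ch in EXPAND]


def candidates(coarse: str, i: int) -> tuple[str, ...]:
    """Characters that the symbol at position i could be expanded to."""
    ch = coarse[i]
    return EXPAND.get(ch, (ch,))


if __name__ == "__main__":
    coarse = merge_similar("tip")
    print(coarse)
    for i in ambiguous_positions(coarse):
        print(i, candidates(coarse, i))
```

Under these assumptions, a second model would only need to choose among the `candidates` at each ambiguous position, using image features and the surrounding text as context.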
Pages: 25-30
Page count: 6