Using the Web to create dynamic dictionaries in handwritten out-of-vocabulary word recognition

Cited by: 6
Authors
Oprean, Cristina [1]
Likforman-Sulem, Laurence [1]
Popescu, Adrian [2]
Mokbel, Chafic [3]
Affiliations
[1] Telecom ParisTech, Inst Mines Telecom, 46 Rue Barrault, F-75013 Paris, France
[2] CEA, LIST, LVIC, F-91190 Gif Sur Yvette, France
[3] Univ Balamand, Fac Engn, Tripoli, Lebanon
DOI
10.1109/ICDAR.2013.199
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Handwriting recognition systems rely on predefined dictionaries obtained from training data. Small and static dictionaries are usually exploited to obtain high in-vocabulary (IV) accuracy at the expense of coverage. Thus the recognition of out-of-vocabulary (OOV) words cannot be handled efficiently. To improve OOV recognition while keeping IV dictionaries small, we introduce a multi-step approach that exploits Web resources. After an initial IV-OOV sequence classification, external resources are used to create OOV sequence-adapted dynamic dictionaries. A final Viterbi-based decoding is performed over the dynamic dictionary to determine the most probable word for the OOV sequence. We validate our approach with experiments conducted on RIMES, a publicly available database. Results show that improvements are obtained compared to standard handwriting recognition, performed with a static dictionary. Both domain-adapted and generic dynamic dictionaries are studied and we show that domain adaptation is beneficial.
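The three steps named in the abstract (IV-OOV classification, dynamic dictionary creation, and a final decoding pass restricted to that dictionary) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the authors' implementation: the confidence threshold, the function names, the use of string similarity from `difflib` to pick dictionary candidates, and the reuse of that same similarity as a stand-in for Viterbi-based decoding over handwriting models.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for a real recognition score."""
    return SequenceMatcher(None, a, b).ratio()


def is_oov(confidence: float, threshold: float = 0.5) -> bool:
    """Hypothetical IV-OOV test: low recognizer confidence suggests an OOV word."""
    return confidence < threshold


def build_dynamic_dictionary(hypothesis: str, external_words: list[str],
                             size: int = 3) -> list[str]:
    """Keep the external words closest to the first-pass character hypothesis.

    `external_words` stands in for vocabulary harvested from Web resources.
    """
    ranked = sorted(external_words,
                    key=lambda w: similarity(hypothesis, w),
                    reverse=True)
    return ranked[:size]


def recognize(hypothesis: str, confidence: float,
              external_words: list[str]) -> str:
    """Return the hypothesis for IV words, else re-decode over a dynamic dictionary."""
    if not is_oov(confidence):
        return hypothesis
    dynamic = build_dynamic_dictionary(hypothesis, external_words)
    # Stand-in for the final decoding pass: choose the best-scoring entry.
    return max(dynamic, key=lambda w: similarity(hypothesis, w))


if __name__ == "__main__":
    web_words = ["facture", "fracture", "voiture", "nature", "lecture"]
    print(recognize("factvre", 0.2, web_words))  # low confidence: re-decoded
    print(recognize("bonjour", 0.9, web_words))  # high confidence: kept as-is
```

In a real system the second pass would rescore the word image against each dictionary entry with the handwriting models rather than compare strings; the sketch only shows how a small, sequence-adapted dictionary narrows the search.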
Pages: 989-993
Page count: 5