Using the Web to create dynamic dictionaries in handwritten out-of-vocabulary word recognition

Cited by: 6
Authors
Oprean, Cristina [1 ]
Likforman-Sulem, Laurence [1 ]
Popescu, Adrian [2 ]
Mokbel, Chafic [3 ]
Affiliations
[1] Telecom ParisTech, Inst Mines Telecom, 46 Rue Barrault, F-75013 Paris, France
[2] CEA, LIST, LVIC, F-91190 Gif Sur Yvette, France
[3] Univ Balamand, Fac Engn, Tripoli, Lebanon
DOI
10.1109/ICDAR.2013.199
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Handwriting recognition systems rely on predefined dictionaries obtained from training data. Small and static dictionaries are usually exploited to obtain high in-vocabulary (IV) accuracy at the expense of coverage. Thus the recognition of out-of-vocabulary (OOV) words cannot be handled efficiently. To improve OOV recognition while keeping IV dictionaries small, we introduce a multi-step approach that exploits Web resources. After an initial IV-OOV sequence classification, external resources are used to create OOV sequence-adapted dynamic dictionaries. A final Viterbi-based decoding is performed over the dynamic dictionary to determine the most probable word for the OOV sequence. We validate our approach with experiments conducted on RIMES, a publicly available database. Results show that improvements are obtained compared to standard handwriting recognition, performed with a static dictionary. Both domain-adapted and generic dynamic dictionaries are studied and we show that domain adaptation is beneficial.
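The multi-step flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the edit-distance filter used to build the dynamic dictionary, the `score` callable standing in for the Viterbi-based decoding, and all function names and parameters (`build_dynamic_dictionary`, `recognize`, `max_distance`, `max_size`) are assumptions made for illustration only.

```python
from typing import Callable, Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def build_dynamic_dictionary(hypothesis: str,
                             web_words: Iterable[str],
                             max_distance: int = 2,
                             max_size: int = 50) -> List[str]:
    """Keep Web-derived words close to the first-pass hypothesis.

    Assumption: closeness is measured by edit distance; the abstract does
    not state how the dynamic dictionary is selected.
    """
    scored = sorted((edit_distance(hypothesis, w), w) for w in set(web_words))
    return [w for d, w in scored if d <= max_distance][:max_size]


def recognize(hypothesis: str,
              is_oov: bool,
              static_dictionary: Iterable[str],
              web_words: Iterable[str],
              score: Callable[[str], float]) -> str:
    """Decode over the dynamic dictionary for OOV sequences, else the static one.

    `score` is a hypothetical stand-in for the decoder's word score
    (higher is better).
    """
    if is_oov:
        candidates = build_dynamic_dictionary(hypothesis, web_words)
    else:
        candidates = list(static_dictionary)
    if not candidates:
        return hypothesis  # nothing to rescore against; keep the first pass
    return max(candidates, key=score)


if __name__ == "__main__":
    web = ["facture", "factor", "fracture", "banana"]
    best = recognize("facturr", True, ["lettre"], web,
                     score=lambda w: -edit_distance("facturr", w))
    print(best)
```

In a real system the `score` callable would be replaced by the recognizer's likelihood for each candidate word given the handwritten sequence; the edit-distance score here only keeps the example self-contained.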
Pages: 989-993
Number of pages: 5