Text detection and recognition in images and video frames

Cited by: 151
Authors
Chen, DT [1]
Odobez, JM [1]
Bourlard, H [1]
Affiliations
[1] Dalle Molle Inst Perceptual Artificial Intelligen, CH-1920 Martigny, Switzerland
Keywords
text localization; text segmentation; text recognition; SVM; MRF; video OCR
DOI
10.1016/j.patcog.2003.06.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new method for detecting and recognizing text in complex images and video frames. Text detection is performed in a two-step approach that combines the speed of a text localization step, enabling text size normalization, with the strength of a machine learning text verification step applied on background-independent features. Text recognition, applied on the detected text lines, is addressed by a text segmentation step followed by a traditional OCR algorithm within a multi-hypotheses framework relying on multiple segments, language modeling and OCR statistics. Experiments conducted on large databases of real broadcast documents demonstrate the validity of our approach. © 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Pages: 595-608
Number of pages: 14
Related papers
30 in total
[1] [Anonymous], P ACM INT C DIG LIB
[3] IMAGE-RESTORATION USING AN ESTIMATED MARKOV MODEL [J]. CHALMOND, B. SIGNAL PROCESSING, 1988, 15 (02): 115-129
[4] CHEN D, 2002, IDIAPRR02
[5] Chen DT, 2001, PROC CVPR IEEE, P621
[6] Text enhancement with asymmetric filter for video OCR [J]. Chen, DT; Shearer, K; Bourlard, H. 11TH INTERNATIONAL CONFERENCE ON IMAGE ANALYSIS AND PROCESSING, PROCEEDINGS, 2001: 192-197
[7] A parallel mixture of SVMs for very large scale problems [J]. Collobert, R; Bengio, S; Bengio, Y. NEURAL COMPUTATION, 2002, 14 (05): 1105-1114
[8] MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J]. DEMPSTER, AP; LAIRD, NM; RUBIN, DB. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01): 1-38
[9] FAST ALGORITHMS FOR THE DISCRETE COSINE TRANSFORM [J]. FEIG, E; WINOGRAD, S. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1992, 40 (09): 2174-2193
[10] Garcia C, 2000, INT CONF ACOUST SPEE, P2326, DOI 10.1109/ICASSP.2000.859306