Video text recognition using sequential Monte Carlo and error voting methods

Cited by: 32
Authors
Chen, DT [1 ]
Odobez, JM [1 ]
Affiliations
[1] IDIAP Res Inst, CH-1920 Martigny, Valais, Switzerland
Keywords
video text recognition; text segmentation; sequential Monte-Carlo filter; language model; recognition output voting error reduction;
DOI
10.1016/j.patrec.2004.11.019
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the segmentation and recognition of text embedded in video sequences, working from the text image sequences extracted by a text detection module. To this end, we propose a probabilistic algorithm based on Bayesian adaptive thresholding and Monte Carlo sampling. The algorithm approximates the posterior distribution of the segmentation thresholds of text pixels in an image by a set of weighted samples. The sample set is initialized by applying a classical segmentation algorithm to the first video frame and is then refined by random sampling under a temporal Bayesian framework. One important contribution of the paper is to show that, thanks to the proposed methodology, the likelihood of a segmentation parameter sample can be estimated not from a classification criterion or a visual quality criterion based on the produced segmentation map, but directly from the induced text recognition result, which is directly relevant to the task. As a second contribution, we propose to align the text recognition results from high-confidence samples gathered over time and to compose a final result at the character level using an error voting technique (ROVER). Experiments are conducted on a two-hour video database. Character recognition rates higher than 93% and word recognition rates higher than 90% are achieved, which are 4% and 3% higher, respectively, than those of state-of-the-art methods applied to the same database. © 2004 Elsevier B.V. All rights reserved.
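The two ideas in the abstract, a weighted sample set over segmentation thresholds and character-level voting over recognition results, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the random-walk dynamics, the clamp to the 0 to 255 range, and the toy likelihood are all assumptions made for illustration. In the paper the likelihood comes from the text recognition result, and ROVER first aligns the strings, whereas this sketch assumes they are already aligned and of equal length.

```python
import math
import random
from collections import Counter


def smc_step(particles, likelihood, rng, noise_std=5.0):
    """One filter step over threshold samples: resample, diffuse, reweight.

    particles: list of (threshold, weight) pairs.
    likelihood: hypothetical callable mapping a threshold to a non-negative
        score; it stands in for a score derived from the recognition output.
    """
    thresholds = [t for t, _ in particles]
    weights = [w for _, w in particles]
    # Resample thresholds in proportion to their current weights.
    resampled = rng.choices(thresholds, weights=weights, k=len(particles))
    # Diffuse with Gaussian noise (assumed random-walk dynamics), clamped to 0..255.
    moved = [min(255.0, max(0.0, t + rng.gauss(0.0, noise_std))) for t in resampled]
    # Reweight each sample by its likelihood and normalize.
    scores = [likelihood(t) for t in moved]
    total = sum(scores)
    if total <= 0.0:
        # Fall back to uniform weights if every sample scored zero.
        return [(t, 1.0 / len(moved)) for t in moved]
    return [(t, s / total) for t, s in zip(moved, scores)]


def vote_characters(aligned):
    """Majority vote per character position over pre-aligned, equal-length strings."""
    if not aligned:
        return ""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("strings must be aligned to the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


# Toy usage: a bell-shaped likelihood centred on an assumed "good" threshold of 120.
rng = random.Random(0)
particles = [(float(t), 0.25) for t in (90, 110, 130, 150)]
particles = smc_step(particles, lambda t: math.exp(-(((t - 120.0) / 20.0) ** 2)), rng)
print(particles)
print(vote_characters(["BREAKING", "BREAK1NG", "8REAKING"]))
```

In this sketch, voting repairs isolated character errors only when a majority of the aligned strings agree at that position, which is why the abstract restricts voting to results from high-confidence samples.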
Pages: 1386-1403
Page count: 18
Related papers
50 in total
  • [21] Multisensor fusion for target tracking using sequential Monte Carlo methods
    Vemula, Mahesh
    Djuric, Petar M.
    2005 IEEE/SP 13th Workshop on Statistical Signal Processing (SSP), Vols 1 and 2, 2005, : 1223 - 1227
  • [22] Tracking variable number of targets using sequential Monte Carlo methods
    Ng, William
    Li, Jack
    Godsill, Simon
    Vermaak, Jaco
    2005 IEEE/SP 13TH WORKSHOP ON STATISTICAL SIGNAL PROCESSING (SSP), VOLS 1 AND 2, 2005, : 1207 - 1211
  • [23] Blind speech dereverberation using batch and sequential Monte Carlo methods
    Evers, Christine
    Hopgood, James R.
    Bell, Judith
    PROCEEDINGS OF 2008 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-10, 2008, : 3226 - +
  • [24] Pulse pressure variation tracking using sequential Monte Carlo methods
    Kim, Sunghan
    Aboy, Mateo
    McNames, James
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2013, 8 (04) : 333 - 340
  • [25] Sequential Dynamic Leadership Inference Using Bayesian Monte Carlo Methods
    Li, Qing
    Ahmad, Bashar I.
    Godsill, Simon J.
    IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2021, 57 (04) : 2039 - 2052
  • [26] On multitarget track extraction and maintenance using sequential Monte Carlo methods
    Ulmke, M
    SIGNAL AND DATA PROCESSING OF SMALL TARGETS 2005, 2005, 5913
  • [27] Prediction in hidden Markov models using sequential Monte Carlo methods
    Zhang, Dongqing
    Ning, Xuanxi
    Liu, Xueni
    Ma, Hongwei
    PROCEEDINGS OF 2007 IEEE INTERNATIONAL CONFERENCE ON GREY SYSTEMS AND INTELLIGENT SERVICES, VOLS 1 AND 2, 2007, : 718 - 722
  • [28] A sequential Monte Carlo method for Bayesian face recognition
    Matsui, Atsushi
    Clippingdale, Simon
    Matsumoto, Takashi
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS, 2006, 4109 : 578 - 586
  • [29] Error bounds for sequential Monte Carlo samplers for multimodal distributions
    Paulin, Daniel
    Jasra, Ajay
    Thiery, Alexandre
    BERNOULLI, 2019, 25 (01) : 310 - 340
  • [30] VLSI architecture of a wireless channel estimator using Sequential Monte Carlo methods
    Shabany, M
    Shojania, H
    Zhang, J
    Omidi, J
    Gulak, PG
    2005 IEEE 6TH WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS, 2005, : 450 - 454