Universal Metadata Repository for Document Analysis and Recognition

Cited by: 0
Authors
Al-Barhamtoshy, H. [1]
Khemakhem, M. [1]
Jambi, K. [1]
Essa, F. [1]
Fattouh, A. [1]
Al-Ghamdi, A. [1]
Affiliation
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah, Saudi Arabia
Source
2016 IEEE/ACS 13TH INTERNATIONAL CONFERENCE OF COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2016
Keywords
document analysis and recognition; dataset; metadata; repository; ICDAR; 2015; COMPETITION
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Document Analysis and Recognition (DAR) has two main objectives. The first is to analyze the physical structure of the input document image, which should lead to correct identification of its homogeneous components and their boundaries in terms of XY coordinates. The second is to recognize each of these homogeneous components; for example, a text image should be recognized and converted into intelligible text. DAR remains one of the most challenging topics in pattern recognition. Indeed, despite the diversity of proposed approaches, techniques, and methods, results remain weak and fall short of expectations, especially for several categories of documents such as complex, low-quality, handwritten, and historical documents. The complex structure and/or morphology of such documents is behind the weak results of these approaches, techniques, and methods. One challenging problem related to this topic is the creation of standard datasets that can be used by all stakeholders, such as system developers, expert evaluators, and users. Another challenging problem is how to take advantage of existing datasets, which unfortunately are dispersed around the world, most often without any information about their locations or how to reach them. As an attempt to solve these two problems, we propose in this paper a Universal Datasets Repository for Document Analysis and Recognition (UMDAR) that has a twofold advantage. First, it can help dataset creators standardize their datasets and make them accessible to the research community once published on the proposed repository. Second, it can serve as a central hub that bridges, in a smart manner, datasets and all DAR stakeholders.
Pages: 6