Automatic extraction of table metadata from digital documents

Cited by: 0
Authors
Liu, Ying [1]
Mitra, Prasenjit [1]
Giles, C. Lee [1]
Bai, Kun [1]
Affiliation
[1] Penn State Univ, Coll Informat Sci & Technol, University Pk, PA 16802 USA
Keywords
metadata extraction; table detection; table structure recognition; searching; exchanging
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and to highlight a collection of results obtained from experiments and scientific analysis. In digital libraries, extracting this data automatically and understanding the structure and content of tables are very important to many applications. Automatic identification, extraction, and search of table contents can be made more precise with the help of metadata. In this paper, we propose a set of medium-independent table metadata to facilitate table indexing, searching, and exchanging. To extract the contents of tables and their metadata, an automatic table metadata extraction algorithm is designed and tested on PDF documents.
Pages: 339 / +
Page count: 2