Automated document content characterization for a multimedia document retrieval system

Authors
Koivusaari, M
Sauvola, J
Pietikainen, M
Keywords
document layout analysis; predictive coding; document database; retrieval; document content characterization; object-oriented database
DOI
10.1117/12.290337
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
We propose a new approach for automating document image layout extraction to populate the features of an object-oriented database, using rapid low-level feature analysis, preclassification, and predictive coding. The layout information, comprising region location and classification data, is transformed into 'feature object(s)'. This information is then fed into an intelligent document image retrieval (IDIR) system for use in document retrieval schemes. The IDIR system consists of a user interface, an object-oriented database, and a variety of document image analysis algorithms. In this paper, the object-oriented storage model and the database system are presented in the formal and functional domains. The graphical user interface and a visual document image browser are also described, as are the document analysis techniques used for document characterization. In this context, documents consist of text, picture, and other (possibly embedded) media data. Documents are stored in the database as document, page, and region objects. Our test system has been implemented and tested using a document database of 10 000 documents.
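The document, page, and region hierarchy mentioned in the abstract might be sketched roughly as follows. This is only an illustrative sketch, not the storage model of the paper: the class names, the `bbox` layout, the label set, and the helpers `label_fractions` and `find_documents` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Region:
    # Bounding box as (x, y, width, height) in pixels (assumed layout).
    bbox: Tuple[int, int, int, int]
    # Region class label such as "text" or "picture" (hypothetical label set).
    label: str


@dataclass
class Page:
    number: int
    regions: List[Region] = field(default_factory=list)


@dataclass
class Document:
    doc_id: str
    pages: List[Page] = field(default_factory=list)

    def label_fractions(self) -> Dict[str, float]:
        """Share of total region area per label: a toy stand-in for a feature object."""
        areas: Dict[str, int] = {}
        for page in self.pages:
            for region in page.regions:
                _, _, width, height = region.bbox
                areas[region.label] = areas.get(region.label, 0) + width * height
        total = sum(areas.values())
        return {label: area / total for label, area in areas.items()} if total else {}


def find_documents(docs: List[Document], label: str, min_fraction: float) -> List[str]:
    """Return ids of documents whose area share for `label` is at least `min_fraction`."""
    return [d.doc_id for d in docs if d.label_fractions().get(label, 0.0) >= min_fraction]


# Two hypothetical one-page documents with different text/picture proportions.
docs = [
    Document("a", [Page(1, [Region((0, 0, 100, 100), "text"),
                            Region((0, 100, 100, 100), "picture")])]),
    Document("b", [Page(1, [Region((0, 0, 100, 300), "text"),
                            Region((0, 300, 100, 100), "picture")])]),
]
picture_heavy = find_documents(docs, "picture", 0.4)
```

Here a query such as "documents where pictures cover at least 40% of the region area" is answered from the stored layout data alone, which is the general idea of retrieving documents by layout features rather than by text content.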
Pages: 148-159
Page count: 12