Managing very large document collections using semantics

Cited by: 1
Authors
Wang, GR [1]
Lu, HJ
Yu, G
Bao, YB
Affiliations
[1] Northeastern Univ, Dept Comp Sci, Shenyang 110004, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
Keywords
semantic document; multidimensional exploring; document querying
DOI
10.1007/BF02948912
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a system in which documents are no longer identified by their file names. Instead, each document is represented by its semantics, expressed as a descriptor and a content vector. The descriptor consists of a set of attributes, such as the document's creation date, type, size, and annotations. The content vector consists of a set of terms extracted from the document. XBASE, a semantic document management system, is designed and implemented on the basis of these semantics and comprises three main modules: X-Loader, X-Explorer and X-Query.
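The representation described in the abstract (a descriptor of attributes plus a content vector of extracted terms) can be illustrated with a minimal Python sketch. It is not the XBASE implementation: the names `SemanticDocument`, `extract_terms`, and `matches`, the regex-based term extraction, and the sample attribute values are all assumptions made for illustration, since the abstract does not say how terms are extracted or how queries are evaluated.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticDocument:
    """Hypothetical record: a document identified by semantics, not file name."""
    descriptor: dict              # attributes such as type, creation date, size
    content_vector: frozenset     # set of terms extracted from the document


def extract_terms(text, min_length=3):
    """Assumed extraction rule: lowercase alphabetic words of a minimum length."""
    words = re.findall(r"[a-z]+", text.lower())
    return frozenset(w for w in words if len(w) >= min_length)


def matches(doc, attributes=None, terms=()):
    """Return True if the document has all given attributes and all given terms."""
    attributes = attributes or {}
    has_attributes = all(doc.descriptor.get(k) == v for k, v in attributes.items())
    has_terms = all(t.lower() in doc.content_vector for t in terms)
    return has_attributes and has_terms


# Illustrative values only; none of them come from the paper.
doc = SemanticDocument(
    descriptor={"type": "report", "created": "2003-05-01", "size": 2048},
    content_vector=extract_terms("Semantic management of large document collections"),
)
```

With this sketch, a lookup such as `matches(doc, {"type": "report"}, ["semantic", "document"])` selects the document by what it is and what it contains, without referring to a file name.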
Pages: 403-406
Page count: 4
Related papers
50 in total
  • [1] Managing very large document collections using semantics
    GuoRen Wang
    HongJun Lu
    Ge Yu
    YuBin Bao
    Journal of Computer Science and Technology, 2003, 18 : 403 - 406
  • [2] Efficient clustering of very large document collections
    Dhillon, IS
    Fan, J
    Guan, YQ
    DATA MINING FOR SCIENTIFIC AND ENGINEERING APPLICATIONS, 2001, 2 : 357 - 381
  • [3] Topic modeling for mediated access to very large document collections
    Muresan, G
    Harper, DJ
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2004, 55 (10): : 892 - 910
  • [4] High-Speed Rough Clustering for Very Large Document Collections
    Kishida, Kazuaki
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2010, 61 (06): : 1092 - 1104
  • [6] Managing information disparity in multilingual document collections
    Duh, K. (kevinduh@gmail.com), 2013, Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, United States (10):
  • [7] THESUS: Organizing Web document collections based on link semantics
    Halkidi, M
    Nguyen, B
    Varlamis, I
    Vazirgiannis, M
    VLDB JOURNAL, 2003, 12 (04): : 320 - 332
  • [8] THESUS: Organizing Web document collections based on link semantics
    Maria Halkidi
    Benjamin Nguyen
    Iraklis Varlamis
    Michalis Vazirgiannis
    The VLDB Journal, 2003, 12 : 320 - 332
  • [9] Facilitating Understanding of Large Document Collections
    Bae, Jae Hyeon
    Xu, Weijia
    Esteva, Maria
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 1334 - 1338
  • [10] Fast categorisation of large document collections
    Shanks, V
    Williams, HE
    EIGHTH SYMPOSIUM ON STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2001, : 194 - 204