Ontology Based Document Data Analysis

Cited by: 0
Authors
Zafar, Ambreen [1]
Awais, Muhammad [1]
Aftab, Muhammad Ahmad [1]
Affiliations
[1] GCUF, Dept Software Engn, Faisalabad, Pakistan
Source
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY | 2018 / Vol. 18 / No. 11
Keywords
Clustering; Ontology; WordNet; Concept Weight; TF-IDF;
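DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract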
A vast amount of data is generated at a rapid pace over the internet through blogs, online forums, emails, and similar channels. The huge volume and complex semantics of this unstructured data create a need for effective management to support efficient retrieval. Users often find it difficult to choose the right search keywords to retrieve relevant results. Every natural language also contains polysemous words, i.e., words whose meaning depends on context. Additional relations among words, such as the super-subordinate relation (hypernym/hyponym) and the part-whole relation (meronym/holonym), can also be incorporated to capture the semantics of a user's query. Document clustering combined with an ontology gives users a way to overcome the difficulties of traditional keyword-based search; it aims to reduce search time and improve the retrieval of relevant documents. This research proposes a semantic document clustering technique that applies the K-means clustering algorithm to a concept weight matrix computed using a modified TF-IDF approach. The weights are calculated specifically for the features and their relations extracted from the WordNet ontology. The silhouette coefficient is used as a measure of cluster purity.
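The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `CONCEPTS` table is a hypothetical stand-in for WordNet lookups, `HYPERNYM_WEIGHT` is an assumed discount for related concepts, the weighting is a plain smoothed TF-IDF rather than the paper's modified variant, and k-means uses a farthest-first start so the result is deterministic.

```python
import math
from collections import Counter

# Hypothetical stand-in for WordNet: word -> (concept, hypernym of that concept).
CONCEPTS = {
    "dog": ("dog", "animal"), "puppy": ("dog", "animal"),
    "cat": ("cat", "animal"), "kitten": ("cat", "animal"),
    "car": ("car", "vehicle"), "automobile": ("car", "vehicle"),
    "truck": ("truck", "vehicle"),
}
HYPERNYM_WEIGHT = 0.5  # assumed discount for a related (hypernym) concept

def concept_counts(text):
    """Count concepts in a document, crediting each concept's hypernym too."""
    counts = Counter()
    for word in text.lower().split():
        if word in CONCEPTS:
            concept, hypernym = CONCEPTS[word]
            counts[concept] += 1.0
            counts[hypernym] += HYPERNYM_WEIGHT
    return counts

def concept_weight_matrix(docs):
    """Build L2-normalised TF-IDF weights over concepts instead of raw terms."""
    counts = [concept_counts(d) for d in docs]
    vocab = sorted({c for cnt in counts for c in cnt})
    n = len(docs)
    df = {c: sum(1 for cnt in counts if c in cnt) for c in vocab}
    rows = []
    for cnt in counts:
        row = [cnt[c] * (math.log((1 + n) / (1 + df[c])) + 1.0) for c in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocab, rows

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(rows, k, iters=20):
    """Plain k-means with a deterministic farthest-first initialisation."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(dist(r, c) for c in centers)))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(rows, labels):
    """Mean silhouette coefficient; assumes at least two non-empty clusters."""
    scores = []
    for i, r in enumerate(rows):
        own = [dist(r, rows[j]) for j in range(len(rows))
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(r, rows[j]) for j in range(len(rows)) if labels[j] == lab)
            / labels.count(lab)
            for lab in set(labels) if lab != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

docs = [
    "the dog chased the cat",
    "a puppy and a kitten and a puppy",
    "the car passed a truck",
    "an automobile and a truck",
]
vocab, rows = concept_weight_matrix(docs)
labels = kmeans(rows, k=2)
score = silhouette(rows, labels)
print(vocab, labels, round(score, 3))
```

Because synonyms such as "puppy" and "dog" map to the same concept, and both contribute to the shared hypernym "animal", the two animal documents end up close together even though they share no surface words. A real system would replace the lookup table with WordNet queries and would likely need word-sense disambiguation for polysemous terms.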
Pages: 42-48
Page count: 7
Related papers
50 in total
  • [1] Ontology-Based Document Profile for Vulnerability Relevancy Analysis
    Wita, Ratsameetip
    Jiamnapanon, Nattanatch
    Teng-Amnuay, Yunyong
    SELECTED TOPICS IN APPLIED COMPUTER SCIENCE, 2010, : 210 - +
  • [2] Ontology based document enrichment in bioinformatics
    Stevens, R
    COMPARATIVE AND FUNCTIONAL GENOMICS, 2002, 3 (01): : 42 - 46
  • [3] An Ontology Based Model for Document Clustering
    Sridevi, U.
    Nagaveni, N.
    INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2011, 7 (03) : 54 - 69
  • [4] Semantic document clustering based on ontology
    Wang, Ying
    Peng, Tao
    Zuo, Wanli
    He, Fengling
    Wang, Dong
    Journal of Computational Information Systems, 2009, 5 (03): : 1437 - 1444
  • [5] Ontology-based document extraction processing
    Gu, N
    Wang, F
    Wu, GW
    PROCEEDINGS OF THE SEVENTH INTERNATIONAL CONFERENCE ON CSCW IN DESIGN, 2002, : 65 - 67
  • [6] An Ontology Learning Method Based on Document Clustering
    Wei, Xianmin
    FRONTIERS OF MANUFACTURING AND DESIGN SCIENCE II, PTS 1-6, 2012, 121-126 : 1911 - 1915
  • [7] Ontology-based text document clustering
    Staab, S
    Hotho, A
    INTELLIGENT INFORMATION PROCESSING AND WEB MINING, 2003, : 451 - 452
  • [8] A Text Document Clustering Method Based on Ontology
    Ding, Yi
    Fu, Xian
    ADVANCES IN NEURAL NETWORKS - ISNN 2011, PT II, 2011, 6676 : 199 - 206
  • [9] Ontology-based MEDLINE document classification
    Camous, Fabrice
    Blott, Stephen
    Smeaton, Alan F.
    BIOINFORMATICS RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2007, 4414 : 439 - +
  • [10] A Framework for Analysis of Ontology-Based Data Access
    Konys, Agnieszka
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2016, PT II, 2016, 9876 : 397 - 408