Measuring semantic similarity of documents with weighted cosine and fuzzy logic

Cited by: 4
Authors
Huetle-Figueroa, Juan [1]
Perez-Tellez, Fernando [1]
Pinto, David [2]
Affiliations
[1] Technol Univ Dublin, Dept Comp, Blessington Rd, Dublin D24 FKT9, Ireland
[2] Benemerita Univ Autonoma Puebla, PUE, Fac Comp Sci, Puebla, Mexico
Keywords
Semantic similarity; semantic matching; document similarity; cosine enrichment; keyword enrichment
DOI
10.3233/JIFS-179889
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semantic analysis is currently used in many fields, such as information retrieval, the biomedical domain, and natural language processing. This research focuses on using semantic methods, the cosine similarity algorithm, and fuzzy logic to improve document matching. The algorithms were applied to plain texts, in this case CVs (resumes) and job descriptions. WordNet synsets were used to enrich semantic similarity methods such as Wu-Palmer similarity (WUP), Leacock-Chodorow similarity (LCH), and path similarity (hypernym/hyponym). Additionally, keyword extraction was used to create a postings list in which keywords were weighted. The work addresses the task of recruiting new personnel for companies that publish job descriptions and, reciprocally, of finding a company when workers publish their resumes. Comparing the proposed methods required creating a new gold standard, so a web application was designed for matching the documents manually. The results of the proposed methods were then compared against this gold standard to check their effectiveness, which confirmed the benefits of enriching the cosine algorithm semantically. The measures used for the analysis were precision, recall, and F-measure; the authors conclude that semantically weighted cosine similarity can be used to obtain better similarity scores.
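As a rough illustration of the idea of weighting keywords inside a cosine similarity, the following is a minimal sketch and not the authors' implementation. The function name `weighted_cosine`, the sample texts, and the values in `keyword_weights` are all hypothetical, and the sketch omits the WordNet enrichment and fuzzy logic described in the abstract.

```python
import math
from collections import Counter


def weighted_cosine(doc_a, doc_b, weights=None):
    """Cosine similarity over term counts, with optional per-term weights.

    `weights` maps a term to a multiplier; terms not listed default to 1.0.
    This is an illustrative assumption, not the weighting used in the paper.
    """
    weights = weights or {}
    # Term-frequency vectors, each count scaled by its keyword weight.
    vec_a = {t: c * weights.get(t, 1.0) for t, c in Counter(doc_a).items()}
    vec_b = {t: c * weights.get(t, 1.0) for t, c in Counter(doc_b).items()}
    dot = sum(v * vec_b.get(t, 0.0) for t, v in vec_a.items())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical CV and job description, tokenized on whitespace.
cv = "python developer with machine learning experience".split()
job = "senior python engineer machine learning".split()
# Hypothetical keyword weights, e.g. from a keyword extraction step.
keyword_weights = {"python": 2.0, "machine": 1.5, "learning": 1.5}

plain = weighted_cosine(cv, job)
weighted = weighted_cosine(cv, job, keyword_weights)
print(f"plain={plain:.3f} weighted={weighted:.3f}")
```

In this toy case the shared terms carry the extra weight, so the weighted score comes out higher than the plain one; with weights placed on terms the two documents do not share, the score would drop instead.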
Pages: 2263 - 2278
Number of pages: 16
Related papers
50 in total
  • [21] Distance Weighted Cosine Similarity Measure for Text Classification
    Li, Baoli
    Han, Liping
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2013, 2013, 8206 : 611 - 618
  • [22] Visual Tracking via Weighted Local Cosine Similarity
    Wang, Dong
    Lu, Huchuan
    Bo, Chunjuan
    IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45 (09) : 1838 - 1850
  • [23] SWE: a novel method with semantic-weighted edge for measuring gene functional similarity
    Tian, Zhen
    Fang, Haichuan
    Ye, Yangdong
    Zhu, Zhenfeng
    2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2020, : 1672 - 1678
  • [24] A Novel Method for Measuring Structure and Semantic Similarity of XML Documents Based on Extended Adjacency Matrix
    Zhang, Xue-Liang
    Yang, Ting
    Fan, Bao-Quan
    Wang, Xu
    Wei, Jin-Mao
    INTERNATIONAL CONFERENCE ON APPLIED PHYSICS AND INDUSTRIAL ENGINEERING 2012, PT B, 2012, 24 : 1452 - 1461
  • [25] A comprehensive weighted semantic similarity algorithm
    School of Computer and Communication, Lanzhou University of Technology, Lanzhou, China
    J. Inf. Comput. Sci., 11 (4395-4404):
  • [26] A Hybrid Approach for Measuring Semantic Similarity between Documents and its Application in Mining the Knowledge Repositories
    Sumathy, K. L.
    Dr Chidambaram
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (08) : 231 - 237
  • [27] MeSHSim: An R/Bioconductor package for measuring semantic similarity over MeSH headings and MEDLINE documents
    Zhou, Jing
    Shui, Yuxuan
    Peng, Shengwen
    Li, Xuhui
    Mamitsuka, Hiroshi
    Zhu, Shanfeng
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2015, 13 (06)
  • [28] MeSHSim: An R/Bioconductor package for measuring semantic similarity over MeSH headings and MEDLINE documents
    Zhou Jing
    Shui Yuxuan
    Peng Shengwen
    Li Xuhui
    Mamitsuka, Hiroshi
    Zhu Shanfeng
    2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015, : 8535 - 8539
  • [29] Semantic Penumbra: Concept Similarity in Logic
    Woods, John
    TOPOI-AN INTERNATIONAL REVIEW OF PHILOSOPHY, 2012, 31 (01) : 121 - 134
  • [30] Semantic Penumbra: Concept Similarity in Logic
    John Woods
    Topoi, 2012, 31 : 121 - 134