A Novel TF-IDF Weighting Scheme for Effective Ranking

Cited by: 0
Authors
Paik, Jiaul H. [1]
Affiliation
[1] Indian Statistical Institute, Kolkata, India
Keywords
Document ranking; Retrieval model; Term weighting; Information retrieval; Model
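DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract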
Term weighting schemes are central to the study of information retrieval systems. This article proposes a novel TF-IDF term weighting scheme that employs two different within-document term frequency normalizations to capture two different aspects of term saliency. One component of the term frequency is effective for short queries, while the other performs better on long queries. The final weight is a weighted combination of these two components, with the mixing weight determined by the length of the corresponding query. Experiments conducted on a large number of TREC news and web collections demonstrate that the proposed scheme almost always outperforms five state-of-the-art retrieval models with remarkable significance and consistency. The experimental results also show that the proposed model achieves significantly better precision than the existing models.
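The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: two within-document term frequency normalizations blended by a weight that depends on query length. The specific normalizations, the squashing step, the form of the mixing weight, the choice of which component is favored for short queries, and all function and parameter names are assumptions made for illustration, not the article's definitions.

```python
import math


def mixing_weight(query_len: int) -> float:
    """Assumed mixing weight: 1.0 for a one-term query, shrinking as the query grows."""
    return 2.0 / (1.0 + math.log2(1.0 + query_len))


def squash(x: float) -> float:
    """Map a non-negative value into [0, 1) so both components share a scale."""
    return x / (1.0 + x)


def combined_tf(tf: int, doc_len: int, avg_doc_len: float,
                avg_tf_in_doc: float, query_len: int) -> float:
    """Blend two hypothetical TF normalizations according to query length."""
    if tf <= 0:
        return 0.0
    # Component A (assumed): frequency relative to the document's average term frequency.
    relative_tf = math.log2(1.0 + tf) / math.log2(1.0 + avg_tf_in_doc)
    # Component B (assumed): frequency adjusted for document length.
    length_tf = tf * math.log2(1.0 + avg_doc_len / doc_len)
    w = mixing_weight(query_len)
    return w * squash(relative_tf) + (1.0 - w) * squash(length_tf)


def idf(num_docs: int, doc_freq: int) -> float:
    """A common smoothed inverse document frequency (also an assumption here)."""
    return math.log((num_docs + 1) / doc_freq)


def score(query_terms, doc_tf, avg_doc_len, num_docs, doc_freqs) -> float:
    """Sum combined TF times IDF over the query terms found in the document."""
    doc_len = sum(doc_tf.values())
    avg_tf = doc_len / len(doc_tf)  # mean frequency of the document's distinct terms
    total = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        df = doc_freqs.get(term, 0)
        if tf == 0 or df == 0:
            continue  # term absent from the document or the collection
        total += combined_tf(tf, doc_len, avg_doc_len, avg_tf, len(query_terms)) * idf(num_docs, df)
    return total


if __name__ == "__main__":
    doc = {"term": 3, "weighting": 1, "scheme": 1}
    print(score(["term", "weighting"], doc, avg_doc_len=5.0, num_docs=10,
                doc_freqs={"term": 2, "weighting": 4}))
```

With this assumed weight, a single-term query relies entirely on component A, and component B gains influence as the query gets longer, which mirrors the short-query and long-query split described in the abstract.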
Pages: 343-352
Number of pages: 10