Probabilistic data fusion on a large document collection

Cited by: 1
Authors
Lillis, David [1]
Toolan, Fergus [2]
Collier, Rem [1]
Dunnion, John [1]
Affiliations
[1] Univ Coll Dublin, Sch Comp Sci & Informat, Dublin 4, Ireland
[2] Griffith Coll Dublin, Fac Computing Sci, Dublin 8, Ireland
Keywords
data fusion; information retrieval; ProbFuse
DOI
10.1007/s10462-007-9037-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Data fusion is the process of combining the output of a number of Information Retrieval (IR) algorithms into a single result set, to achieve greater retrieval performance. ProbFuse is a data fusion algorithm that uses the history of the underlying IR algorithms to estimate the probability that subsequent result sets include relevant documents in particular positions. It has been shown to outperform CombMNZ, the standard data fusion algorithm against which to compare performance, in a number of previous experiments. This paper builds upon this previous work and applies probFuse to the much larger Web Track document collection from the 2004 Text REtrieval Conference. The performance of probFuse is compared against that of CombMNZ using a number of evaluation measures and is shown to achieve substantial performance improvements.
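The two fusion strategies named in the abstract can be illustrated with a minimal Python sketch. This record does not give either algorithm's details, so everything below is an assumption: `comb_mnz` follows the commonly described CombMNZ rule (sum of min-max normalized scores multiplied by the number of systems returning the document), and `train_segment_probs` / `prob_fuse` are a simplified, position-based scheme in the spirit of the abstract's description, in which each ranked list is split into segments, the fraction of relevant documents per segment is estimated from past runs, and that estimate is divided by the segment number at fusion time. The function names, the pooling of training queries, and the tie-breaking by document id are hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one system's raw scores into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_sets):
    """CombMNZ as commonly described: summed normalized score times hit count."""
    total = defaultdict(float)
    hits = defaultdict(int)
    for scores in result_sets:
        for doc, s in min_max_normalize(scores).items():
            total[doc] += s
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    # Highest fused score first; ties broken by document id (an assumption).
    return sorted(fused, key=lambda d: (-fused[d], d))


def segment_of(rank, length, num_segments):
    """0-based segment index for a 0-based rank in a list of the given length."""
    return min(rank * num_segments // length, num_segments - 1)


def train_segment_probs(training_runs, num_segments):
    """Estimate P(relevant | segment) for ONE system from its past runs.

    training_runs is a list of (ranked_docs, relevant_set) pairs. Counts are
    pooled across queries here, which is a simplification.
    """
    relevant = [0] * num_segments
    seen = [0] * num_segments
    for ranked, rel in training_runs:
        for rank, doc in enumerate(ranked):
            k = segment_of(rank, len(ranked), num_segments)
            seen[k] += 1
            relevant[k] += doc in rel
    return [r / n if n else 0.0 for r, n in zip(relevant, seen)]


def prob_fuse(rankings, probs_per_system, num_segments):
    """Fuse rankings: sum over systems of P(relevant | segment) / segment number."""
    fused = defaultdict(float)
    for ranked, probs in zip(rankings, probs_per_system):
        for rank, doc in enumerate(ranked):
            k = segment_of(rank, len(ranked), num_segments)
            fused[doc] += probs[k] / (k + 1)  # segment numbers are 1-based
    return sorted(fused, key=lambda d: (-fused[d], d))
```

The key contrast is in the inputs: the CombMNZ sketch needs only the retrieval scores of the current query, whereas the position-based sketch ignores scores and relies on relevance judgments from earlier queries, which is what "the history of the underlying IR algorithms" suggests.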
Pages: 23-34
Page count: 12
Source: Artificial Intelligence Review, 2006, 26: 23-34