Probabilistic information retrieval approach for ranking of database query results

Cited by: 48
Authors
Chaudhuri, Surajit [1]
Das, Gautam
Hristidis, Vagelis
Weikum, Gerhard
Affiliations
[1] Microsoft Res, 1 Microsoft Way, Redmond, WA 98052 USA
[2] Univ Texas, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[3] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[4] Max Planck Inst Informat, D-66123 Saarbrucken, Germany
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2006 / Vol. 31 / No. 3
Keywords
experimentation; performance; theory; probabilistic information retrieval; user survey; indexing; automatic ranking; relational queries; workload
DOI
10.1145/1166074.1166085
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We investigate the problem of ranking the answers to a database query when many tuples are returned. In particular, we present methodologies to tackle the problem for conjunctive and range queries, by adapting and applying principles of probabilistic models from information retrieval for structured data. Our solution is domain independent and leverages data and workload statistics and correlations. We evaluate the quality of our approach with a user survey on a real database. Furthermore, we present and experimentally evaluate algorithms to efficiently retrieve the top ranked results, which demonstrate the feasibility of our ranking system.
Pages: 1134-1168
Page count: 35