Text Retrieval Priors for Bayesian Logistic Regression

Cited by: 6
Authors
Yang, Eugene [1 ]
Lewis, David D. [2 ]
Frieder, Ophir [1 ]
Affiliations
[1] Georgetown Univ, IR Lab, Washington, DC 20057 USA
[2] Cyxtera Technol, Dallas, TX USA
Keywords
text classification; regularization; ad hoc retrieval; Bayesian priors; Bayesian logistic regression
DOI
10.1145/3331184.3331299
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Discriminative learning algorithms such as logistic regression excel when training data are plentiful, but falter when they are meager. An extreme case is text retrieval (zero training data), where discriminative learning is impossible and heuristics such as BM25, which combine domain knowledge (a topical keyword query) with generative learning (Naive Bayes), are dominant. Building on past work, we show that BM25-inspired Gaussian priors for Bayesian logistic regression based on topical keywords provide better effectiveness than the usual L2 (zero mode, uniform variance) Gaussian prior. On two high-recall retrieval datasets, the resulting models transition smoothly from BM25-level effectiveness to discriminative effectiveness as training data volume increases, dominating L2 regularization even when substantial training data are available.
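The idea in the abstract, a Gaussian prior whose mode comes from keyword weights rather than zero, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact prior construction, so the IDF-style weight in `keyword_prior_mean`, the shared variance `prior_var`, the omission of an intercept, the plain gradient-descent fit, and the toy vocabulary are all assumptions made for illustration.

```python
import numpy as np

def keyword_prior_mean(vocab, query_terms, doc_freq, n_docs):
    """Hypothetical prior mode: an IDF-style weight for query terms, 0 elsewhere."""
    mean = np.zeros(len(vocab))
    for j, term in enumerate(vocab):
        if term in query_terms:
            df = doc_freq[term]
            # Robertson/Sparck Jones style IDF, as used in BM25-like scoring.
            mean[j] = np.log((n_docs - df + 0.5) / (df + 0.5))
    return mean

def fit_map_logreg(X, y, prior_mean, prior_var=1.0, lr=0.1, n_iter=500):
    """MAP weights for logistic regression under the prior N(prior_mean, prior_var * I).

    Uses fixed-step gradient descent; the step size is assumed small enough
    for the data and prior variance at hand.
    """
    w = prior_mean.astype(float).copy()
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))  # predicted probabilities
        # Gradient of the negative log posterior: log-loss term + prior term.
        grad = X.T @ (p - y) + (w - prior_mean) / prior_var
        w -= lr * grad
    return w

# Toy data (assumed, for illustration only).
vocab = ["court", "ruling", "the"]
doc_freq = {"court": 10, "ruling": 5, "the": 990}
prior = keyword_prior_mean(vocab, {"court", "ruling"}, doc_freq, n_docs=1000)

# Zero training documents: the MAP estimate is the keyword-based prior mode.
w_zero = fit_map_logreg(np.zeros((0, 3)), np.zeros(0), prior)

# A few labeled documents: the weights move away from the prior mode.
X = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 0, 1]], dtype=float)
y = np.array([1, 0, 1, 0], dtype=float)
w_fit = fit_map_logreg(X, y, prior)
```

With no labeled documents the data term of the gradient vanishes, so the model scores documents purely by the keyword weights. As labeled documents arrive, the likelihood term pulls the weights away from the prior mode. Setting `prior_mean` to zeros recovers the usual L2 penalty.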
Pages: 1045-1048
Page count: 4
Related papers
50 in total
  • [21] POSTER: A novel Content-based Image Retrieval system based on Bayesian Logistic Regression
    Arias-Nicolas, J. P.
    Calle-Alonso, F.
    WSCG 2010: POSTER PROCEEDINGS, 2010, : 19 - 22
  • [22] Laplace Power-Expected-Posterior Priors for Logistic Regression
    Porwal, Anupreet
    Rodriguez, Abel
    BAYESIAN ANALYSIS, 2024, 19 (04): : 1163 - 1186
  • [23] Conjugate priors and variable selection for Bayesian quantile regression
    Alhamzawi, Rahim
    Yu, Keming
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2013, 64 : 209 - 219
  • [24] Cauchy and other shrinkage priors for logistic regression in the presence of separation
    Ghosh, Joyee
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2019, 11 (06):
  • [25] Exploiting Informative Priors for Bayesian Classification and Regression Trees
    Angelopoulos, Nicos
    Cussens, James
    19TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-05), 2005, : 641 - 646
  • [26] Bayesian analysis in multivariate regression models with conjugate priors
    Arashi, M.
    Iranmanesh, Anis
    Norouzirad, M.
    Jenatabadi, Hashem Salarzadeh
    STATISTICS, 2014, 48 (06) : 1324 - 1334
  • [27] Survival Regression Models With Dependent Bayesian Nonparametric Priors
    Riva-Palacio, Alan
    Leisen, Fabrizio
    Griffin, Jim
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, 117 (539) : 1530 - 1539
  • [28] Data augmentation priors for Bayesian and semi-Bayes analyses of conditional-logistic and proportional-hazards regression
    Greenland, S
    Christensen, R
    STATISTICS IN MEDICINE, 2001, 20 (16) : 2421 - 2428
  • [29] A novel approach based on logistic regression and Bayesian for relevance feedback in content-based image retrieval
    Kong, Jun
    Wang, Xuefeng
    Liu, Zhen
    Zhang, Xiaohua
    Cui, Jingxia
    Zhang, Jingbo
    CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 2, PROCEEDINGS, 2008, : 455 - 459
  • [30] Logistic regression against a divergent Bayesian network
    Sanchez Trujillo, Noel Antonio
    MEDWAVE, 2015, 15 (01):