Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation

Authors
Kato, Masahiro [1]
Teshima, Takeshi [2]
Affiliations
[1] CyberAgent Inc, Tokyo, Japan
[2] Univ Tokyo, Tokyo, Japan
Keywords
COVARIATE SHIFT; INFERENCE; MIXTURE;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Density ratio estimation (DRE) is at the core of various machine learning tasks such as anomaly detection and domain adaptation. Among existing approaches to DRE, methods based on Bregman divergence (BD) minimization have been studied extensively. However, when applied with highly flexible models such as deep neural networks, BD minimization tends to suffer from what we call train-loss hacking, a source of overfitting caused by a typical characteristic of empirical BD estimators. In this paper, to mitigate train-loss hacking, we propose a non-negative correction for empirical BD estimators. Theoretically, we confirm the soundness of the proposed method through a generalization error bound. In our experiments, the proposed methods show favorable performance in inlier-based outlier detection.
Pages: 14