Maximum likelihood estimation of mixture densities for binned and truncated multivariate data

Cited by: 27
Authors
Cadez, IV [1]
Smyth, P
McLachlan, GJ
McLaren, CE
Affiliations
[1] Univ Calif Irvine, Dept Informat & Comp Sci, Irvine, CA 92697 USA
[2] Univ Queensland, Dept Math, Brisbane, Qld 4072, Australia
[3] Univ Calif Irvine, Dept Med, Div Epidemiol, Irvine, CA 92697 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA); Australian Research Council
Keywords
EM; binned; truncated; histogram; mixture model; KL-distance; iron deficiency anemia;
DOI
10.1023/A:1013679611503
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
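As a rough illustration of the quantity being maximized, the sketch below evaluates a binned and truncated log-likelihood for the simpler univariate Gaussian mixture case, where the integral over each bin reduces to a difference of normal CDFs. It is not the paper's multivariate procedure or its numerical speedups; the function names, the argument layout, and the omission of the multinomial constant are all assumptions made for this example.

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of a univariate normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def bin_masses(edges, weights, means, sds):
    """Mixture probability mass in each bin [edges[j], edges[j+1]).

    Hypothetical helper: in the multivariate case these would be
    multidimensional integrals over each bin instead of CDF differences.
    """
    masses = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mass = sum(
            w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
            for w, m, s in zip(weights, means, sds)
        )
        masses.append(mass)
    return masses

def binned_truncated_loglik(counts, edges, weights, means, sds):
    """Log-likelihood of bin counts, up to an additive constant.

    Truncation is handled by renormalizing each bin mass by the total
    mass of the observed region (assumption: data outside the outermost
    edges are never recorded).
    """
    masses = bin_masses(edges, weights, means, sds)
    observed = sum(masses)  # less than 1 when the data are truncated
    return sum(
        n * math.log(p / observed)
        for n, p in zip(counts, masses)
        if n > 0
    )

# Example: two symmetric bins around the mean of a single standard normal.
ll = binned_truncated_loglik([5, 5], [-1.0, 0.0, 1.0], [1.0], [0.0], [1.0])
```

An EM or generic optimizer would re-evaluate the bin masses at every iteration, which is why the cost of these integrals dominates once the data are multivariate.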
Pages: 7-34
Number of pages: 28