Bayesian Mixture Models with Focused Clustering for Mixed Ordinal and Nominal Data

Cited by: 11
Authors
DeYoreo, Maria [1, 3]
Reiter, Jerome P. [1, 4]
Hillygus, D. Sunshine [2, 5]
Affiliations
[1] Duke Univ, Dept Stat Sci, Durham, NC 27708 USA
[2] Duke Univ, Dept Polit Sci, Durham, NC USA
[3] Duke Univ, Durham, NC USA
[4] Duke Univ, Stat Sci, Durham, NC USA
[5] Duke Univ, Polit Sci, Durham, NC USA
Source
BAYESIAN ANALYSIS | 2017 / Vol. 12 / Issue 03
Funding
U.S. National Science Foundation;
Keywords
categorical; missing; mixture model; multiple imputation; categorical data; binary
DOI
10.1214/16-BA1020
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
In some contexts, mixture models can fit certain variables well at the expense of others in ways beyond the analyst's control. For example, when the data include some variables with non-trivial amounts of missing values, the mixture model may fit the marginal distributions of the nearly and fully complete variables at the expense of the variables with high fractions of missing data. Motivated by this setting, we present a mixture model for mixed ordinal and nominal data that splits variables into two groups, focus variables and remainder variables. The model allows the analyst to specify a rich sub-model for the focus variables and a simpler sub-model for remainder variables, yet still capture associations among the variables. Using simulations, we illustrate advantages and limitations of focused clustering compared to mixture models that do not distinguish variables. We apply the model to handle missing values in an analysis of the 2012 American National Election Study, estimating relationships among voting behavior, ideology, and political party affiliation.
Pages: 679-703
Page count: 25
Related papers
50 in total
  • [32] Clustering Upper Level Units in Multilevel Models for Ordinal Data
    Grilli, Leonardo
    Panzera, Agnese
    Rampichini, Carla
    CLASSIFICATION, (BIG) DATA ANALYSIS AND STATISTICAL LEARNING, 2018 : 137 - 144
  • [33] A dissimilarity measure for mixed nominal and ordinal attribute data in k-Modes algorithm
    Yuan, Fang
    Yang, Youlong
    Yuan, Tiantian
    APPLIED INTELLIGENCE, 2020, 50 (05) : 1498 - 1509
  • [34] A Unified Entropy-Based Distance Metric for Ordinal-and-Nominal-Attribute Data Clustering
    Zhang, Yiqun
    Cheung, Yiu-Ming
    Tan, Kay Chen
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (01) : 39 - 52
  • [35] Learnable Weighting of Intra-Attribute Distances for Categorical Data Clustering with Nominal and Ordinal Attributes
    Zhang, Yiqun
    Cheung, Yiu-ming
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (07) : 3560 - 3576
  • [36] Bayesian approach for mixture models with grouped data
    Gau, Shiow-Lan
    Tapsoba, Jean de Dieu
    Lee, Shen-Ming
    COMPUTATIONAL STATISTICS, 2014, 29 (05) : 1025 - 1043
  • [38] Bayesian mixed membership models for soft clustering and classification
    Erosheva, EA
    Fienberg, SE
    CLASSIFICATION - THE UBIQUITOUS CHALLENGE, 2005 : 11 - 26
  • [39] Bayesian mixture models for cytometry data analysis
    Lin, Lin
    Hejblum, Boris P.
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2021, 13 (04)
  • [40] Bayesian analysis of ordinal categorical data under a mixed inheritance model
    Skotarczak, Ewa
    Molinski, Krzysztof
    Szwaczkowski, Tomasz
    Dobek, Anita
    ARCHIV FUR TIERZUCHT-ARCHIVES OF ANIMAL BREEDING, 2011, 54 (01) : 93 - 103