Cluster analysis for the selection of potential discriminatory variables and the identification of subgroups in archaeometry

Cited by: 3
Authors
Lopez-Garcia, Pedro A. [1 ]
Argote, Denisse L. [2 ]
Affiliations
[1] Escuela Nacl Antropol Hist, Posgrad Arqueol, Perifer Sur Esq,Calle Zapote,Col Isidro Fabela, Mexico City, Mexico
[2] Inst Nacl Antropol & Hist, Direcc Estudios Arqueol, Tacuba 76,Colonia Ctr, Mexico City, Mexico
Keywords
Archaeological glass; High-dimensional data; Dimensionality reduction; Feature selection; Databionic Swarm; Data visualization; COMPOSITIONAL DATA-ANALYSIS; R PACKAGE; MODEL; GLASS; CLASSIFICATION; KNOWLEDGE; ANTWERP;
DOI
10.1016/j.jasrep.2023.104022
Chinese Library Classification (CLC) number
K85 [Cultural relics and archaeology];
Discipline classification code
0601 ;
Abstract
This article compares three variable selection methods based on Gaussian mixture models to find the subset of variables that provides the "best" clustering. It emphasizes the use of an appropriate transformation for compositional data, whose geometric space is the simplex. The comparison showed that the models can cluster data in multiple phases, and that selecting the relevant variables is more convenient than relying on 2D plots or including all available variables simultaneously in a multivariate analysis. Once the informative variables for the clustering were obtained, we used a method called Databionic Swarm (DBS). DBS is an unsupervised machine learning method that takes advantage of emergence and swarm intelligence to find natural chemical groups in the input data space. It can visualize high-dimensional distances in the projection through a 3D topographic map with hypsometric tints. The results were compared in terms of accuracy, both in the selection of the variables and in the classification, using a supervised accuracy index for clustering and two unsupervised indexes (the heatmap and the silhouette plot). The concepts and methods were illustrated by applying them to two published archaeological glass data sets. The first set consisted of 245 Romano-British glass vessels and the second of 180 glass vessels from 15th-17th century Antwerp. In these applications, the variable selection methods increased the accuracy of the classification compared to traditional methods.
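The abstract does not name the transformation it applies to compositional data. As a minimal sketch, assuming the centered log-ratio (clr) transform as one common choice, the step could look like the following. The function names and the sample values are hypothetical and are not taken from the paper or its data sets.

```python
import math


def closure(parts, total=1.0):
    """Rescale the parts so that they sum to `total` (a point on the simplex)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centered log-ratio: log of each part divided by the geometric mean."""
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]


# Hypothetical oxide percentages, for illustration only (assumed values)
sample = [70.0, 15.0, 8.0, 7.0]
coords = clr(closure(sample))
```

The transformed coordinates sum to zero and do not depend on the scale of the input, so percentages and proportions give the same result. Zero concentrations would need to be handled before the logarithm is taken.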
Pages: 22