Compositional data analysis by the square-root transformation: Application to NBA USG% data

Cited by: 0
Authors
Lee, Jeseok [1]
Kim, Byungwon [1]
Affiliations
[1] Kyungpook Natl Univ, Dept Stat, 80 Daehak Ro, Daegu 41566, South Korea
Keywords
compositional data analysis; log-ratio transformation; square-root transformation; sports data analysis; clustering
DOI
10.29220/CSAM.2024.31.3.349
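Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract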
Compositional data are data whose component values sum to a constant, so the sample space is a simplex and statistical methods developed for the usual Euclidean vector space cannot be applied directly. A natural way to overcome this restriction is to apply a transformation that maps the sample space onto Euclidean space; log-ratio type transformations, such as the additive log-ratio (ALR), the centered log-ratio (CLR), and the isometric log-ratio (ILR) transformations, have been the most widely used. However, under sparsity, where some components are exactly zero, these log-ratio type transformations may not be effective. In this work, we propose an alternative, the square-root transformation, which maps the original sample space onto the directional space. We compare the square-root transformation with the log-ratio type transformations through a simulation study and a real data example. In the real data example, we apply both types of transformations to the USG% data obtained from the NBA and use a density-based clustering method, DBSCAN (density-based spatial clustering of applications with noise), to present the results.
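The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the square-root transformation is the component-wise square root of a composition rescaled to sum to 1, which places the point on the unit sphere, and it uses the standard CLR definition. The function names and the usage shares are hypothetical.

```python
import math

def closure(x):
    """Rescale non-negative parts so they sum to 1 (a point on the simplex)."""
    total = sum(x)
    return [v / total for v in x]

def sqrt_transform(x):
    """Component-wise square root of the closed composition.

    Because the parts sum to 1, the squared entries of the result sum to 1,
    so the result lies on the unit sphere. Zeros are handled without issue.
    """
    return [math.sqrt(v) for v in closure(x)]

def clr_transform(x):
    """Centered log-ratio: log of each part minus the mean of the logs.

    Undefined when any part is exactly zero (math.log raises ValueError).
    """
    logs = [math.log(v) for v in closure(x)]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical usage shares for four players, one of them exactly zero.
usage = [0.5, 0.3, 0.2, 0.0]

z = sqrt_transform(usage)
norm = math.sqrt(sum(v * v for v in z))  # Euclidean norm of the transformed point

try:
    clr_transform(usage)
    clr_ok = True
except ValueError:
    clr_ok = False  # the zero component makes the log-ratio undefined
```

In this sketch `norm` equals 1 up to floating-point error, while `clr_ok` ends up `False` because the zero share has no logarithm. For strictly positive compositions `clr_transform` returns values that sum to zero.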
Pages: 349-363
Number of pages: 15