The R Package Ecosystem for Robust Statistics

Author
Todorov, Valentin [1]
Affiliation
[1] United Nations Industrial Development Organization (UNIDO), Vienna, Austria
Source
WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS | 2024 / Vol. 16 / Issue 06
Keywords
high dimensions; multivariate; outlier; R; robust; PRINCIPAL COMPONENT ANALYSIS; PROJECTION-PURSUIT APPROACH; MULTIVARIATE LOCATION; OUTLIER DETECTION; FAST ALGORITHM; REGRESSION; ESTIMATORS; COVARIANCE; DISPERSION; SCATTER;
DOI
10.1002/wics.70007
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208; 070103; 0714;
Abstract
In the last few years, the number of R packages implementing different robust statistical methods has increased substantially. There are now numerous packages for computing robust multivariate location and scatter, robust multivariate analysis such as principal components and discriminant analysis, robust linear models, and other algorithms designed to cope with outliers and other irregularities in the data. This abundance of package options may be overwhelming for both beginners and more experienced R users. Here we provide an overview of the 25 most important R packages for different tasks. As metrics for the importance of each package, we consider its maturity and history, the number of total and average monthly downloads from CRAN (The Comprehensive R Archive Network), and the number of reverse dependencies. We then briefly describe what each of these packages does. After that, we elaborate on several of the above-mentioned topics of robust statistics, presenting the methodology and its implementation in R and illustrating the application with real data examples. Particular attention is paid to robust methods and algorithms suitable for high-dimensional data. The code for all examples is accessible on the GitHub repository.
Pages: 30