Quantifying the information lost in optimal covariance matrix cleaning

Authors
Bongiorno, Christian [1 ]
Lamrani, Lamia [1 ]
Affiliations
[1] Univ Paris Saclay, Lab Math & Informat Complex & Syst, 9 Rue Joliot Curie, F-91192 Gif Sur Yvette, France
Keywords
Random matrix theory; Covariance matrix estimation; Genetic regressor programming; High-dimension statistics; Information theory; DIVERGENCE;
DOI
10.1016/j.physa.2024.130225
Chinese Library Classification
O4 [Physics];
Subject Classification Code
0702;
Abstract
Obtaining an accurate estimate of the underlying covariance matrix from finite sample data is challenging due to the noise induced by the finite sample size. In recent years, sophisticated covariance-cleaning techniques based on random matrix theory have been proposed to address this issue. Most of these methods aim to achieve an optimal covariance matrix estimator by minimizing the Frobenius norm distance as a measure of the discrepancy between the true covariance matrix and the estimator. However, this practice offers limited interpretability in terms of information theory. To better understand this relationship, we focus on the Kullback-Leibler divergence to quantify the information lost by the estimator. Our analysis centers on rotationally invariant estimators, which are state-of-the-art in random matrix theory, and we derive an analytical expression for their Kullback-Leibler divergence. Due to the intricate nature of the calculations, we use genetic programming regressors paired with human intuition. Ultimately, using this approach, we formulate a conjecture, validated through extensive simulations, showing that the Frobenius distance corresponds to a first-order expansion term of the Kullback-Leibler divergence, thus establishing a better-defined link between the two measures.
Pages: 9
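As a rough illustration of the abstract's central claim, the sketch below (not taken from the paper; the dimension, perturbation size, and variable names such as Sigma and Xi are illustrative assumptions) numerically compares the Gaussian Kullback-Leibler divergence between a true covariance matrix and a nearby estimator with its leading-order term, a precision-weighted squared Frobenius distance that reduces to the plain squared Frobenius distance when the true covariance is the identity.

```python
# Minimal numerical sketch (not the authors' code): for zero-mean Gaussians,
# the Kullback-Leibler divergence between the true covariance Sigma and a
# nearby estimator Xi is, to leading order, a precision-weighted squared
# Frobenius distance.  Dimension, perturbation size, and names are assumptions.
import numpy as np

rng = np.random.default_rng(0)
p = 50

# Random well-conditioned "true" covariance matrix Sigma.
A = rng.standard_normal((p, 4 * p))
Sigma = A @ A.T / (4 * p)

# Small symmetric perturbation; Xi plays the role of a cleaned estimator.
eps = 1e-3
D = rng.standard_normal((p, p))
Xi = Sigma + eps * (D + D.T) / 2

def kl_gaussian(Sigma, Xi):
    """KL( N(0, Sigma) || N(0, Xi) ) for zero-mean Gaussians."""
    M = np.linalg.solve(Xi, Sigma)          # Xi^{-1} Sigma
    _, logdet = np.linalg.slogdet(M)
    return 0.5 * (np.trace(M) - Sigma.shape[0] - logdet)

# Leading-order term: (1/4) * || Sigma^{-1/2} (Xi - Sigma) Sigma^{-1/2} ||_F^2,
# i.e. the plain squared Frobenius distance when Sigma is the identity.
L = np.linalg.cholesky(Sigma)
Linv = np.linalg.inv(L)
W = Linv @ (Xi - Sigma) @ Linv.T
leading_term = 0.25 * np.linalg.norm(W, "fro") ** 2

print(kl_gaussian(Sigma, Xi), leading_term)  # agree up to higher-order terms
```

For a small perturbation the two printed values agree up to higher-order corrections, which is consistent with the abstract's conjecture that the Frobenius distance captures the leading expansion term of the Kullback-Leibler divergence; the paper's precise expansion for rotationally invariant estimators is more specific than this generic check.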