Reference: scaling of scatter matrix to get covariance #3220

Open
josef-pkt opened this issue Oct 1, 2016 · 9 comments
Comments

josef-pkt commented Oct 1, 2016

(parking a reference to a computational detail)

How do we normalize a scatter matrix so that it is consistent for a specific distribution, commonly the normal?
cov = Sigma = c * scatter -> find the "size" factor c

related: MAD, IQR and similar have normalization constants in the univariate case;
here it is the multivariate analogue

Maronna et al. textbook on robust statistics, 2006, section 6.3.2 on page 186:
using the chi2 distribution for the squared Mahalanobis distances, we can calculate
c = median({d_i}_i) / chi2.ppf(0.5, k_vars)

This has also been used, without a reference, in Maronna and Zamar (2002) for cov_ogk.
I didn't see anything mentioned about the "size" estimate for the Tyler estimator of scatter for elliptical distributions.
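In code, the chi-squared median scaling could look like this (a minimal sketch; `scatter_to_cov_factor` is a made-up helper name, not existing API, and `d2` are the squared Mahalanobis distances computed from the unscaled scatter estimate):

```python
import numpy as np
from scipy import stats

def scatter_to_cov_factor(d2, k_vars):
    """Consistency factor c with cov = c * scatter, calibrated at the normal.

    d2 : squared Mahalanobis distances of the observations w.r.t. the
         (unscaled) scatter estimate
    k_vars : dimension, so that d2 is approximately chi2(k_vars) at the normal
    """
    return np.median(d2) / stats.chi2.ppf(0.5, k_vars)
```

At the normal distribution with a correctly scaled scatter matrix, the squared distances are approximately chi2(k_vars) distributed, so the factor is close to 1.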

There are several references (*2) for the consistency and small-sample scaling of MCD and similar estimators, but I didn't look carefully (brief browsing or skimming doesn't show any obvious answer).
Many articles mention the scaling factors, but they don't show the numbers or formulas.

Maronna, Ricardo A., Douglas Martin, and Víctor J. Yohai. 2006. Robust Statistics: Theory and Methods. Reprinted with corr. Wiley Series in Probability and Statistics. Chichester: Wiley.

Maronna, Ricardo A., and Ruben H. Zamar. 2002. “Robust Estimates of Location and Dispersion for High-Dimensional Datasets.” Technometrics 44 (4): 307–17. doi:10.1198/004017002188618509.

(*2)
Hardin, Johanna, and David M. Rocke. 2005. “The Distribution of Robust Distances.” Journal of Computational and Graphical Statistics 14 (4): 928–46. doi:10.1198/106186005X77685.
Pison, G., S. Van Aelst, and G. Willems. 2002. “Small Sample Corrections for LTS and MCD.” Metrika 55 (1–2): 111–23. doi:10.1007/s001840200191.

@josef-pkt

adding this here:

We should have helper functions or additional methods attached to the robust norms to calculate consistency factors, relative efficiency and similar.
Those are needed all over the place, and it would be useful to find them in a central location instead of hardcoding a specific version into each function.
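As a sketch of what such a helper could compute: the asymptotic relative efficiency of a location M-estimator at the normal is (E[psi'(X)])^2 / E[psi(X)^2]. The function name and the explicit Huber psi below are illustrative, not existing statsmodels API:

```python
import numpy as np
from scipy import stats

def location_efficiency(psi, psi_deriv):
    # asymptotic relative efficiency of a location M-estimator at the
    # standard normal: (E[psi'(X)])**2 / E[psi(X)**2]
    e_deriv = stats.norm.expect(psi_deriv)
    e_psi2 = stats.norm.expect(lambda x: psi(x)**2)
    return e_deriv**2 / e_psi2

# Huber's psi with the common tuning constant t = 1.345
t = 1.345
huber_psi = lambda x: np.clip(x, -t, t)
huber_psi_deriv = lambda x: float(abs(x) <= t)
```

For t = 1.345 this evaluates to approximately 0.95, which is why 1.345 is the standard "95% efficiency" tuning constant.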

@josef-pkt

reminder to myself: "M:\josef\eclipsegworkspace\statsmodels-git\local_scripts\local_scripts\try_rlm_winsorized.py"
has the variance for the truncated mean calculation.
ELTS should also have a truncation correction somewhere.

@josef-pkt

example calculation using the scipy.stats expect method
(found in an old script, and I don't remember the specifics; likely trying for an M- or S-estimator of scale,
log for try_robust_scale.py or try_robust_scale_iter.py)

(lines copied out of sequence)

>>> norm = rnorms.HuberT(2.5)
>>> norm = rnorms.TukeyBiweight()
>>> stats.norm.expect(lambda x, *args: norm.psi(np.abs(x**2)))
0.8093246617772843
>>> stats.norm.expect(lambda x, *args: rnorms.TukeyBiweight().rho(x))
0.43684963023076195
>>> stats.norm.expect(lambda x, *args: norm.rho(x))
0.3692679350253787

>>> norm = rnorms.TukeyBiweight()
>>> stats.norm.expect(lambda x, *args: x * norm.psi(x))
0.7577759186353068

>>> stats.norm.expect(lambda x, *args: x * norm.psi(x) - norm.rho(x))
0.3209262884898523
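For reference, the Tukey bisquare case can be reproduced without statsmodels. The sketch below hardcodes the standard bisquare psi with the default tuning constant c = 4.685 (my assumption for the default above) and evaluates E[x * psi(x)] under the standard normal, which should match the 0.7577759... value in the snippet:

```python
import numpy as np
from scipy import stats

C = 4.685  # standard TukeyBiweight tuning constant

def tukey_psi(x):
    # bisquare psi: x * (1 - (x/c)^2)^2 inside [-c, c], zero outside
    x = np.asarray(x, dtype=float)
    u = (x / C)**2
    return np.where(u <= 1, x * (1 - u)**2, 0.0)

# E[x * psi(x)] at the standard normal, as in the expect call above
e_xpsi = stats.norm.expect(lambda x: x * tukey_psi(x))
```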


josef-pkt commented Oct 5, 2016

this is also related to #3181

another related reference:
Croux, Christophe, and Catherine Dehon. 2010. “Influence Functions of the Spearman and Kendall Correlation Measures.” Statistical Methods & Applications 19 (4): 497–515. doi:10.1007/s10260-010-0142-z.

It covers the quadrant correlation, Kendall's tau and Spearman's rho,
with the conversion to make them consistent with the normal correlation (however, a transformed correlation matrix is not always positive semidefinite; see the later article referenced below).
It also includes the asymptotic variance of the correlation coefficients at the normal distribution
(these underestimate to various degrees in small samples in their Monte Carlo, so a small-sample correction would help, but the much larger distortion, bias, comes from outliers or, I guess, non-normality in general, as in variance hypothesis tests).

This looks useful, but I don't know where these pieces should go.

Boudt, Kris, Jonathan Cornelissen, and Christophe Croux. 2011. “The Gaussian Rank Correlation Estimator: Robustness Properties.” Statistics and Computing 22 (2): 471–83. doi:10.1007/s11222-011-9237-0.
It also looks at the positive semidefinite problem for Kendall and Spearman after the transformation to consistency with the normal correlation. Recommendation: nobs > 3 * k_vars is needed for Kendall and nobs > 2 * k_vars for Spearman.

The Gaussian rank correlation is consistent and asymptotically efficient (same asymptotic variance as Pearson) at the normal distribution.
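The estimator itself is simple to sketch (a minimal version, assuming the usual normal-scores construction with ranks mapped through Phi^{-1}(R_i / (n + 1)); not existing statsmodels API):

```python
import numpy as np
from scipy import stats

def gaussian_rank_corr(x, y):
    # transform ranks to normal scores, then take their Pearson correlation;
    # both score vectors are permutations of the same values, so this equals
    # sum(zx * zy) / sum(z**2) with z_i = Phi^{-1}(i / (n + 1))
    n = len(x)
    zx = stats.norm.ppf(stats.rankdata(x) / (n + 1))
    zy = stats.norm.ppf(stats.rankdata(y) / (n + 1))
    return np.corrcoef(zx, zy)[0, 1]
```

Like Spearman, this depends only on the ranks, so it is invariant under monotone transformations of each variable.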

not sure yet where to put this

something like the following for the asymptotic variances, matching the examples in the two articles:


pearson
>>> rho = np.array([0.2, 0.8]); (1 - rho**2)**2
array([ 0.9216,  0.1296])

kendall
>>> rho = np.array([0.2, 0.8]); (1 - rho**2) * np.pi**2 * (1./9 - 4 / np.pi**2 * np.arcsin(rho / 2)**2)
array([ 1.01422912,  0.15092577])

quadrant
>>> rho = np.array([0.2, 0.8]); (1 - rho**2) * (np.pi**2 / 4 - np.arcsin(rho)**2)
array([ 2.32978184,  0.57870888])

Spearman is more complicated, with terms like the following. (Note that odeint passes (y, t) to the callback, so the lambda's second argument, named x here, actually receives the integration variable; the cumulative integral therefore comes out as intended despite the misleading argument names.)

integrate.odeint(lambda t, x: np.arcsin(np.sin(x) / (1 + 2 * np.cos(2 * x))), 0, t=np.linspace(0, np.arcsin(0.5), 11))
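The same term can be written as a plain integral with quad, which avoids the odeint argument-order subtlety (a sketch; the helper name is made up, and the 0.5 argument mirrors the arcsin(0.5) endpoint in the odeint call):

```python
import numpy as np
from scipy import integrate

def spearman_var_term(upper):
    # integral of arcsin(sin(x) / (1 + 2*cos(2*x))) from 0 to arcsin(upper)
    val, _err = integrate.quad(
        lambda x: np.arcsin(np.sin(x) / (1 + 2 * np.cos(2 * x))),
        0, np.arcsin(upper))
    return val
```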

@josef-pkt

two more references with consistency factors for covariance

I'm using Table 1 from Croux and Haesbroeck as test reference numbers (I wrote my function initially partly by trial and error to get correct results in a Monte Carlo).
Riani et al. have a table of numbers for the Tukey bisquare S-estimator, but I only skimmed their paper. (Another paper refers to a table in a conference volume, but I don't have access to it.)

They also have more general expressions for elliptically symmetric distributions (based on the g function).

Croux, Christophe, and Gentiane Haesbroeck. 1999. “Influence Function and Efficiency of the Minimum Covariance Determinant Scatter Matrix Estimator.” Journal of Multivariate Analysis 71 (2): 161–90. doi:10.1006/jmva.1999.1839.

Riani, Marco, Andrea Cerioli, and Francesca Torti. 2014. “On Consistency Factors and Efficiency of Robust S-Estimators.” TEST 23 (2): 356–87. doi:10.1007/s11749-014-0357-7.

@josef-pkt

(not sure which issue is closest for this)
The multivariate t-distribution can be used to estimate the scatter matrix (I have it in some PR).

I just saw that MASS has a function cov.trob: Covariance Estimation for Multivariate t Distribution.
It should be good for unit tests and for checking how they scale the scatter matrix, i.e. obtain the covariance matrix of endog.


josef-pkt commented Oct 4, 2023

"In order to obtain a unique MLE we fix the scale of the estimator by assuming that <trace of omega_inv> of the true covariance matrix is known (or arbitrarily fixed)"
before equ (6) p. 420 in

Soloveychik, Ilya, and Ami Wiesel. “Performance Analysis of Tyler’s Covariance Estimator.” IEEE Transactions on Signal Processing 63, no. 2 (January 2015): 418–26. https://doi.org/10.1109/TSP.2014.2376911.

The usual normalized Tyler scatter matrix has trace(S) = p.
So we can rescale the scatter matrix so that trace(cov) = sum(variances) for some robust or nonrobust variance estimates.
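A minimal sketch of the fixed-point iteration with the trace(S) = p normalization (assumes centered data; `tyler_scatter` is a made-up name, not an existing statsmodels function):

```python
import numpy as np

def tyler_scatter(x, maxiter=200, tol=1e-8):
    """Tyler's M-estimator of scatter via fixed-point iteration,
    normalized to trace(S) = p; x is (nobs, p) and assumed centered."""
    nobs, p = x.shape
    s = np.eye(p)
    for _ in range(maxiter):
        s_inv = np.linalg.inv(s)
        # squared Mahalanobis distances w.r.t. the current scatter iterate
        d2 = np.einsum('ij,jk,ik->i', x, s_inv, x)
        s_new = (p / nobs) * (x.T / d2) @ x
        s_new *= p / np.trace(s_new)  # fix the scale: trace(S) = p
        if np.max(np.abs(s_new - s)) < tol:
            return s_new
        s = s_new
    return s
```

The trace normalization in each step is what pins down the otherwise arbitrary scale of the fixed point; any other size convention (e.g. det(S) = 1) could be substituted there.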

aside: I saw several articles on regularized or shrinkage versions of Tyler's scatter matrix (in analogy to regularizing/shrinking the sample covariance)

a large overview of Tyler's scatter:
Wiesel, Ami, and Teng Zhang. “Structured Robust Covariance Estimation.” Foundations and Trends® in Signal Processing 8, no. 3 (December 21, 2015): 127–216. https://doi.org/10.1561/2000000053.

and several more recent articles (I only skimmed a few parts):

Ashurbekova, Karina, Antoine Usseglio-Carleve, Florence Forbes, and Sophie Achard. “Optimal Shrinkage for Robust Covariance Matrix Estimators in a Small Sample Size Setting,” March 2021. https://hal.science/hal-02378034.

Goes, John, Gilad Lerman, and Boaz Nadler. “Robust Sparse Covariance Estimation by Thresholding Tyler’s M-Estimator.” The Annals of Statistics 48, no. 1 (February 2020): 86–110. https://doi.org/10.1214/18-AOS1793.

Hediger, Simon, Jeffrey Näf, and Michael Wolf. “R-NL: Covariance Matrix Estimation for Elliptical Distributions Based on Nonlinear Shrinkage.” IEEE Transactions on Signal Processing 71 (2023): 1657–68. https://doi.org/10.1109/TSP.2023.3270742.

Ollila, Esa. “Linear Shrinkage of Sample Covariance Matrix or Matrices under Elliptical Distributions: A Review.” arXiv, August 9, 2023. https://doi.org/10.48550/arXiv.2308.04721.

Ollila, Esa, Daniel P. Palomar, and Frédéric Pascal. “Shrinking the Eigenvalues of M-Estimators of Covariance Matrix.” IEEE Transactions on Signal Processing 69 (2021): 256–69. https://doi.org/10.1109/TSP.2020.3043952.

Zhang, Teng, and Ami Wiesel. “Automatic Diagonal Loading for Tyler’s Robust Covariance Estimator.” In 2016 IEEE Statistical Signal Processing Workshop (SSP), 1–5, 2016. https://doi.org/10.1109/SSP.2016.7551741.

another recent review article that looks good and is shorter than the Wiesel and Zhang mini-book:

Taskinen, Sara, Gabriel Frahm, Klaus Nordhausen, and Hannu Oja. “A Review of Tyler’s Shape Matrix and Its Extensions.” In Robust and Multivariate Statistical Methods: Festschrift in Honor of David E. Tyler, edited by Mengxi Yi and Klaus Nordhausen, 23–41. Cham: Springer International Publishing, 2023. https://doi.org/10.1007/978-3-031-22687-8_2.

aside: Nordhausen is co-author or maintainer of several R packages that include extensions of Tyler's scatter estimation

@josef-pkt

New article with an explicit scale estimate for Tyler's shape matrix:

Ollila, Esa, Daniel P. Palomar, and Frederic Pascal. “Affine Equivariant Tyler’s M-Estimator Applied to Tail Parameter Learning of Elliptical Distributions.” arXiv, May 7, 2023. https://doi.org/10.48550/arXiv.2305.04330.

brief skimming:
It looks like the scale is just the average of the inverse weights, see equ. (6).
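My reading of that, as a hypothetical sketch not checked against the paper's notation: Tyler's weights are p / d_i^2, so the average of the inverse weights would be the mean of the squared Mahalanobis distances (w.r.t. the shape matrix) divided by the dimension:

```python
import numpy as np

def tyler_scale(x, shape_inv):
    # hypothetical reading of "average of the inverse weights":
    # mean of d_i^2 / p, with d_i^2 the squared Mahalanobis distances
    # of the (centered) rows of x w.r.t. the shape matrix
    nobs, p = x.shape
    d2 = np.einsum('ij,jk,ik->i', x, shape_inv, x)
    return d2.mean() / p
```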

I can try it out in PR #8129

@josef-pkt

In #9227 I use an M-scale to scale the shape matrix, which is normalized to det(shape) = 1, with a consistency (scale_bias) correction at the normal distribution.
In CovS this is part of the definition.
