ENH: Smoothed quantile regression, smooth approximating functions #9182

Open · josef-pkt opened this issue Mar 27, 2024 · 3 comments

josef-pkt (Member) commented Mar 27, 2024

Related issue and application to penalized estimation: #5350, #5350 (comment)

There are many smooth approximations in the literature for the quantile regression objective ("check") function.
I tried using RLM with M-quantiles before, but in my experiments the parameter estimates were not close to those from QuantReg.

It would be more useful to look at the literature that explicitly uses smoothed quantile regression.

The main advantage would be the ability to use generic optimization, which makes extensions easier to implement, e.g. penalized estimation or nonlinear mean functions.
There might also be a speedup if we can use faster optimization algorithms than IRLS (although for the simple linear case we should still get Koenker's interior point algorithm).

possible implementation
QuantRegSmoothed, a new class with generic optimization in fit and several possible objective functions that differ in the approximating function; it could optionally return QuantRegResults with kernel-based cov_params as in QuantReg.
It should properly be based on a (non-existent) GenericMEstimator class (the version with an objective function; GenericMomEstimator #7436 would be separate).
A quick prototype could be implemented as a subclass of GenericLikelihoodModel (or as another RLM norm, just to see whether that works; that would not be immediately useful for penalized estimation). See the sketch below.
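
A minimal sketch of such a prototype, assuming a GenericLikelihoodModel subclass that minimizes a Huber-type smoothed check function; the class name QuantRegSmoothed and the parameters q and delta are illustrative, not existing statsmodels API:

    import numpy as np
    from statsmodels.base.model import GenericLikelihoodModel

    class QuantRegSmoothed(GenericLikelihoodModel):
        """Sketch: quantile regression with a smoothed check function."""

        def __init__(self, endog, exog, q=0.5, delta=1e-4, **kwds):
            super().__init__(endog, exog, **kwds)
            self.q = q          # target quantile
            self.delta = delta  # smoothing threshold around zero

        def nloglikeobs(self, params):
            # per-observation objective, minimized by the generic optimizers
            u = self.endog - self.exog @ params
            # Huber-type base: quadratic within +-delta, linear outside
            base = np.where(np.abs(u) <= self.delta,
                            u**2 / (2 * self.delta),
                            np.abs(u) - self.delta / 2)
            # tilt by q / (1 - q) to approximate the check function
            return np.where(u < 0, 1 - self.q, self.q) * base

Usage would be along the lines of QuantRegSmoothed(y, X, q=0.25).fit(start_params=ols_params, method="bfgs"); because the fit goes through generic optimization, penalized or nonlinear variants only need a modified objective.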

References, in addition to those in the comments in #5350:

Mkhadri, Abdallah, Mohamed Ouhourane, and Karim Oualkacha. 2017. “A Coordinate Descent Algorithm for Computing Penalized Smooth Quantile Regression.” Statistics and Computing 27 (4): 865–83. https://doi.org/10.1007/s11222-016-9659-9.

Nychka, Doug, Gerry Gray, Perry Haaland, David Martin, and Michael O’Connell. 1995. “A Nonparametric Regression Approach to Syringe Grading for Quality Improvement.” Journal of the American Statistical Association 90 (432): 1171–78. https://doi.org/10.2307/2291509.

Oh, Hee-Seok, Thomas C. M. Lee, and Douglas W. Nychka. 2011. “Fast Nonparametric Quantile Regression With Arbitrary Smoothing Methods.” Journal of Computational and Graphical Statistics 20 (2): 510–26. https://doi.org/10.1198/jcgs.2010.10063.

Ouhourane, Mohamed, Yi Yang, Andréa L. Benedet, and Karim Oualkacha. 2022. “Group Penalized Quantile Regression.” Statistical Methods & Applications 31 (3): 495–529. https://doi.org/10.1007/s10260-021-00580-8.

Yoshida, Takuma. 2023. “Asymptotics for Penalized Spline Estimators in Quantile Regression.” Communications in Statistics - Theory and Methods 52 (14): 4815–34. https://doi.org/10.1080/03610926.2013.765477.

josef-pkt (Member Author) commented Mar 27, 2024

Checking some parts:

Nychka's approximation (as used in Oh et al. 2011) has a continuous first derivative but a discontinuous second derivative (a jump at zero) if q is not 0.5.

The same should be true for all M-quantile norms.

The problem is that, by construction, there are observations with zero residuals in quantile regression, so being at the discontinuity is not a zero-probability event. But we might not get zero residuals with positive probability in the smooth approximation case.
In the Oh et al. article, observations with zero residuals are also treated as positive (the second derivative is q).

AFAICS, Nychka's base_norm is the same as the HuberT norm with rho divided by the threshold c (t in HuberT).
That means that MQuantileNorm with HuberT is the same as Nychka's approximation (the scaling does not affect estimates); see the check below.
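
A quick numerical check of this equivalence, as a sketch; it assumes MQuantileNorm tilts the base rho by q on the positive side and by 1 - q on the negative side:

    import numpy as np
    from statsmodels.robust.norms import HuberT, MQuantileNorm

    q, t = 0.25, 0.5
    u = np.linspace(-3, 3, 1001)

    # Nychka-style rho: HuberT rho divided by the threshold t, then tilted
    base = np.where(np.abs(u) <= t, u**2 / (2 * t), np.abs(u) - t / 2)
    rho_nychka = np.where(u < 0, 1 - q, q) * base

    mq = MQuantileNorm(q, HuberT(t=t))
    # equal up to the constant factor t, so parameter estimates agree
    assert np.allclose(mq.rho(u), t * rho_nychka)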

update
Checking the details of the QuantReg weight function again.
It looks like, with the adjustment around zero, it is exactly the same as Nychka and (rescaled) HuberT with threshold = 1e-6.
However, the weights are rescaled by 1 / (q * (1 - q)):

    # old version: resid = np.where(resid < 0, q * resid, (1 - q) * resid)
    resid = np.where(resid < 0, resid / (1 - q), resid / q)

resid in this code is used as the inverse weight for IRLS.

update to update: I'm not committing this change right now. There might be speed differences that need more investigation.
I see large fluctuations in timing if I only check a single simulation case.
<end update to update>

The weights following from the M-norm derivation, i.e. w = psi / u, are

    weight = (1 - q) / abs(resid)   if resid < c
    weight = q / abs(resid)         if resid >= c

while the current QuantReg fit uses

    weight = 1 / (q * abs(resid))         if resid < c
    weight = 1 / ((1 - q) * abs(resid))   if resid >= c

These are equivalent because WLS weights that differ only by a common scale factor give the same estimates: 1 / q * (q * (1 - q)) = (1 - q).
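
A quick numerical check of this invariance with simulated data (a sketch; the design, response, and starting residuals here are illustrative):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    X = sm.add_constant(rng.normal(size=(100, 2)))
    y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=100)

    q = 0.25
    u = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    au = np.maximum(np.abs(u), 1e-6)  # guard against zero residuals

    w_mnorm = np.where(u < 0, 1 - q, q) / au           # psi / u weights
    w_qreg = np.where(u < 0, 1 / q, 1 / (1 - q)) / au  # current QuantReg
    p1 = sm.WLS(y, X, weights=w_mnorm).fit().params
    p2 = sm.WLS(y, X, weights=w_qreg).fit().params
    assert np.allclose(p1, p2)  # weights differ only by the factor q * (1 - q)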

I can't find a reference for the IRLS computation.
The origin of the implementation goes back to josef-pkt@94fa66c,
which had bugs in the neighborhood of zero.

So I don't see a problem with rewriting this more in the style of IRLS in RLM,
that is, using weights instead of inverse weights, following the derivation from Nychka's rho function.

This means that there should not be a difference between QuantReg IRLS estimation and RLM with HuberT M-quantiles, except for the choice of threshold.
Computational speed should also be approximately the same.
And we get uniqueness of the estimator (for large, non-degenerate samples) based on the properties of the (non-redescending) HuberT M-estimator.

update
We estimate the scale in default RLM.
To match quantile regression with M-quantiles, I guess we should fix scale = 1.

josef-pkt (Member Author) commented Mar 27, 2024

Looking at an old notebook again, try_mquantile_norm-Copy1.ipynb:

I don't see why I remembered that M-quantile and quantile regression don't match up in parameter estimates.
In this example, the match in parameters is good (the standard errors differ very strongly, and the RLM cov_params is not appropriate).
The example uses simulated data with quadratic heteroscedasticity.

    # df holds the simulated data; y, xx are the response and design matrix
    from statsmodels.formula.api import quantreg
    from statsmodels.robust.robust_linear_model import RLM
    from statsmodels.robust import norms
    from statsmodels.robust.norms import MQuantileNorm

    mod = quantreg('dens ~ temp + I(temp ** 2.0)', df)
    res = mod.fit(q=.25)
    print(res.summary())

    t_eps = 1e-10  # 1e-6
    mq_norm = MQuantileNorm(0.25, norms.HuberT(t=t_eps))
    mod_rlm = RLM(y, xx, M=mq_norm)
    res_rlm = mod_rlm.fit()
    print(res_rlm.summary())
                         QuantReg Regression Results                          
==============================================================================
Dep. Variable:                   dens   Pseudo R-squared:               0.1102
Model:                       QuantReg   Bandwidth:                       4.893
Method:                 Least Squares   Sparsity:                        14.56
Date:                Tue, 29 Nov 2022   No. Observations:                  200
Time:                        12:17:55   Df Residuals:                      197
                                        Df Model:                            2
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept         -0.8284      0.692     -1.197      0.233      -2.193       0.536
temp               0.9598      0.198      4.856      0.000       0.570       1.350
I(temp ** 2.0)    -0.7118      0.101     -7.023      0.000      -0.912      -0.512
==================================================================================
                    Robust linear Model Regression Results                    
==============================================================================
Dep. Variable:                      y   No. Observations:                  200
Model:                            RLM   Df Residuals:                      197
Method:                          IRLS   Df Model:                            2
Norm:                   MQuantileNorm                                         
Scale Est.:                       mad                                         
Cov Type:                          H1                                         
Date:                Tue, 29 Nov 2022                                         
Time:                        12:17:55                                         
No. Iterations:                    50                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.8284   1.51e-08   -5.5e+07      0.000      -0.828      -0.828
x1             0.9598   4.34e-09   2.21e+08      0.000       0.960       0.960
x2            -0.7118   2.16e-09   -3.3e+08      0.000      -0.712      -0.712
==============================================================================

Checking the PR: the comments there already looked at this,
#3183 (comment)

josef-pkt (Member Author) commented Mar 28, 2024

Chen, Colin. 2007. “A Finite Smoothing Algorithm for Quantile Regression.” Journal of Computational and Graphical Statistics 16 (1): 136–64. https://doi.org/10.1198/106186007X180336.

In eq. (2.3) they use an M-quantile Huber function that has asymmetric thresholds in the positive and negative parts.
It has continuous derivatives at zero: rho(u) in the center part is proportional to u**2, with the same factor on both sides of zero; see the sketch below.
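
A sketch of such an asymmetric Huber-type rho; the exact thresholds and constants here are my assumptions and should be checked against Chen's eq. (2.3):

    import numpy as np

    def rho_chen(u, q, gamma):
        """Asymmetric Huber-type smoothing of the check function.

        Quadratic u**2 / (2 * gamma), with the same curvature on both
        sides of zero, between the asymmetric thresholds (q - 1) * gamma
        and q * gamma; linear with slopes q - 1 and q outside them.
        """
        lo, hi = (q - 1) * gamma, q * gamma
        return np.where(
            u < lo, (q - 1) * u - (1 - q)**2 * gamma / 2,
            np.where(u > hi, q * u - q**2 * gamma / 2,
                     u**2 / (2 * gamma)))

As gamma shrinks to zero, this converges to the check function, which is consistent with the threshold shrinking during optimization described below.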

This does not fit the M-quantiles pattern, where the slopes at 0+ and 0- differ.
I might have tried something like this before learning about M-quantiles.

They shrink the threshold towards zero during optimization, with a check for when to stop shrinking.

A similar approach, also with asymmetric thresholds and updating of the thresholds, uses IRLS;
its psi function is piecewise linear with a kink at zero, i.e. the second derivative of rho is discontinuous at zero:

Muggeo, Vito M.R., Mariangela Sciandra, and Luigi Augugliaro. 2012. “Quantile Regression via Iterative Least Squares Computations.” Journal of Statistical Computation and Simulation 82 (11): 1557–69. https://doi.org/10.1080/00949655.2011.583650.

Those approximations could fit into the M-quantile setup, but the base_norm would also need to depend on the quantile, whereas in standard M-quantiles the base_norm is fixed, independent of the quantile.
