under what condition will the algorithm not converge? #23

Open
IvonDing opened this issue Nov 15, 2018 · 5 comments

@IvonDing
I am training a group lasso model on time series data, so rho=0 in my model. Each month I train the model on the data from the past 24 months. The parameter alpha is very stable, around 0.0005. However, in several months, when I move the window forward by one month (that is, I replace the data of month t-24 with the data of the latest month), the model does not converge, and I need a much larger alpha, around 0.002, to make it converge. I am wondering why this happens: I only replace a small portion (less than 5%) of the training data, yet the parameter changes sharply. This can make the model unstable; for example, in the previous month I may select 30 features, but in the current month only 16. The traceback when the fit fails:

Traceback (most recent call last):
File "D:\Program Files\Python\Python36\lib\site-packages\IPython\core\interactiveshell.py", line 2963, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 4, in
model = model.fit(p, y)
File "D:\Program Files\Python\Python36\lib\site-packages\sparsereg\model\group_lasso.py", line 33, in fit
max_iter=self.max_iter, rtol=self.tol)
File "D:\Program Files\Python\Python36\lib\site-packages\sparsereg\vendor\group_lasso\group_lasso.py", line 92, in sparse_group_lasso
delta = linalg.norm(tmp - w_new[group])
File "D:\Program Files\Python\Python36\lib\site-packages\scipy\linalg\misc.py", line 137, in norm
a = np.asarray_chkfinite(a)
File "D:\Program Files\Python\Python36\lib\site-packages\numpy\lib\function_base.py", line 1233, in asarray_chkfinite
"array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs
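
For reference, my rolling-window setup looks roughly like the sketch below; the estimator name `SparseGroupLasso` and its keyword arguments here are illustrative assumptions, not necessarily the exact sparsereg API.

```python
import numpy as np
from sparsereg.model.group_lasso import SparseGroupLasso  # class name assumed for illustration

def rolling_fit(monthly_X, monthly_y, groups, window=24, alpha=0.0005):
    """Refit every month on the most recent `window` months of data."""
    models = []
    for t in range(window, len(monthly_X)):
        X = np.vstack(monthly_X[t - window:t])       # drop month t-24, add the latest month
        y = np.concatenate(monthly_y[t - window:t])
        model = SparseGroupLasso(groups=groups, alpha=alpha, rho=0.0)  # rho=0: plain group lasso
        models.append(model.fit(X, y))
    return models
```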

@Ohjeah (Owner) commented Nov 15, 2018

Seems like some of your data are infinite. Did you clean your data before fitting the model?
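
For example, a minimal check along these lines before calling `fit` (just a sketch, the helper name is arbitrary):

```python
import numpy as np

def check_finite(X, y):
    """Both the design matrix and the target should be finite before fitting."""
    if not np.all(np.isfinite(X)):
        raise ValueError("X contains infs or NaNs")
    if not np.all(np.isfinite(y)):
        raise ValueError("y contains infs or NaNs")
```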

@IvonDing (Author) commented Nov 15, 2018

> Seems like some of your data are infinite. Did you clean your data before fitting the model?

Yes, I have transformed my X into quantiles between 0 and 1, and y is also winsorized. I debugged the code and found that, in the loop that solves for w, if I set a small alpha, w becomes very large after a few steps and then goes to infinity. That is where the error 'array must not contain infs or NaNs' comes from. Under what condition does this happen?
I have 500 features, 20 groups, and 20000 samples, which should be sufficiently large; I would expect to obtain an optimal w even when alpha is close to zero. So what might be the problem? Is there any requirement on the input data for your algorithm?
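
To illustrate the kind of blow-up I see, here is a minimal toy example (plain gradient descent on ordinary least squares, not the vendored solver): once the fixed step size exceeds 2 / L, where L is the largest eigenvalue of X.T @ X, the iterates grow toward infinity in the same way.

```python
import numpy as np

# Toy example, not the vendored solver: with a fixed step size larger than
# 2 / L (L = largest eigenvalue of X.T @ X), gradient descent on plain least
# squares diverges and the weights grow toward infinity.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.normal(size=200)
L = np.linalg.eigvalsh(X.T @ X).max()

w = np.zeros(10)
step = 2.5 / L                        # too large: iterates grow without bound
for _ in range(200):
    w = w - step * X.T @ (X @ w - y)  # gradient of 0.5 * ||X w - y||^2
print(np.linalg.norm(w))              # astronomically large; step < 2 / L converges
```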

@IvonDing (Author)

I have found the problem. The step size in the inner group loop is not set properly; the method converges if the step size is made smaller. It seems that your implementation does not strictly follow the paper by Noah Simon.
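
Concretely, what made it stable for me is a conservative fixed step of 1 / L_g per group in the proximal update, roughly as sketched below. This is the standard group-lasso block update (ignoring the 1/n loss scaling), not the vendored code:

```python
import numpy as np

def group_update(X_g, r_g, w_g, alpha):
    """One proximal step for group g with a conservative fixed step size.

    X_g: columns of X belonging to the group; r_g: partial residual
    (y minus the fit of all other groups); w_g: current group coefficients.
    Sketch of the standard update, not the vendored implementation.
    """
    L_g = np.linalg.eigvalsh(X_g.T @ X_g).max()   # Lipschitz constant of the smooth part
    step = 1.0 / L_g
    grad = X_g.T @ (X_g @ w_g - r_g)              # gradient of 0.5 * ||r_g - X_g w_g||^2
    z = w_g - step * grad                         # plain gradient step
    thresh = step * alpha * np.sqrt(len(w_g))     # group penalty, sqrt(group size) weighting
    norm_z = np.linalg.norm(z)
    if norm_z <= thresh:
        return np.zeros_like(w_g)                 # the whole group is set to zero
    return (1.0 - thresh / norm_z) * z            # group soft-thresholding
```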

@Ohjeah (Owner) commented Nov 19, 2018

Thanks for your investigations. The group lasso implementation is actually vendored. Mind sharing how you fixed the problem? We might want to have our own group lasso implementation.

@IvonDing (Author) commented Dec 4, 2018

> Thanks for your investigations. The group lasso implementation is actually vendored. Mind sharing how you fixed the problem? We might want to have our own group lasso implementation.

I just set a smaller step size and increased the maximum number of iterations. This is a very simple adjustment that sacrifices efficiency. I think a better solution would be to adjust the step size dynamically, as described in the paper.
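
A rough sketch of that dynamic rule (standard backtracking line search for generalized gradient descent, as in the paper; the function and argument names here are my own, not the vendored code):

```python
import numpy as np

def backtracking_group_update(X_g, r_g, w_g, thresh_scale, t0=1.0, beta=0.5, max_halvings=50):
    """Group update with a dynamically chosen step size via backtracking.

    thresh_scale is the group penalty weight (e.g. alpha * sqrt(group size)).
    The step is shrunk until the quadratic majorizer at w_g upper-bounds the
    smooth loss at the candidate point.
    """
    def loss(w):
        resid = r_g - X_g @ w
        return 0.5 * resid @ resid

    def prox(z, t):
        thresh = t * thresh_scale
        nz = np.linalg.norm(z)
        return np.zeros_like(z) if nz <= thresh else (1.0 - thresh / nz) * z

    grad = X_g.T @ (X_g @ w_g - r_g)
    t = t0
    for _ in range(max_halvings):
        w_new = prox(w_g - t * grad, t)
        diff = w_new - w_g
        # Accept if loss(w_new) <= loss(w_g) + grad·diff + ||diff||^2 / (2t)
        if loss(w_new) <= loss(w_g) + grad @ diff + diff @ diff / (2.0 * t):
            return w_new
        t *= beta                                  # step too large: shrink and retry
    return w_new
```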
