Efficient Bayesian ridge regression #14

Open
flaxter opened this issue Nov 3, 2016 · 2 comments

@flaxter

flaxter commented Nov 3, 2016

Using 100 KDE features and all the categorical variables, I end up with a dataset that's 840x6578, so I'm inclined to do ridge regression. I tried to implement it in Stan, but it's taking forever to sample the 6578 parameters, I think because there's so much correlation among the covariates. One trick that speeds things up greatly is to do a QR decomposition in R (see stan-dev/rstanarm#30) and then learn 840 parameters based on an orthogonal design matrix. This works great, but I'm not sure how to recover the original 6578 parameters. In any case, this might all be moot: even with the 6578 parameters, I don't really know how to read off something like effect sizes / percent of variance explained. So now I'm thinking I should just do good old-fashioned forward stepwise regression. But even that is slow with Stan: at the moment I've got a matrix that's 840x100, because I want to use the 100 KDE features for the age variable.
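On recovering the original parameters after the QR trick, here is a minimal NumPy sketch of one possible approach, not something taken from rstanarm. It assumes the reduced factorization X = QR with more columns than rows, so that R is wide and R beta = theta has infinitely many solutions; the sketch picks the minimum-norm one. The helper name `recover_coefficients`, the toy sizes, and the random `theta` standing in for a posterior draw are all hypothetical.

```python
import numpy as np


def recover_coefficients(R, theta):
    """Return the minimum-norm beta with R @ beta == theta.

    With n < p the system is underdetermined, so this is one valid
    choice among infinitely many, not "the" original coefficients.
    """
    beta, *_ = np.linalg.lstsq(R, theta, rcond=None)
    return beta


rng = np.random.default_rng(0)
n, p = 40, 300  # small stand-in for the 840 x 6578 design matrix
X = rng.normal(size=(n, p))

# Reduced QR: Q is n x n with orthonormal columns, R is n x p.
Q, R = np.linalg.qr(X)

# Stand-in for a single posterior draw of the n parameters on the Q scale.
theta = rng.normal(size=n)

beta = recover_coefficients(R, theta)

# The recovered beta reproduces the same fitted values as Q @ theta.
fitted_match = np.allclose(X @ beta, Q @ theta)
```

Applied to each posterior draw, this gives draws of a 6578-vector with the same fitted values. One caveat: an isotropic prior on theta is not the same model as an isotropic ridge prior on the original coefficients, so the recovered values are not draws from the ridge posterior.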

So should I forget Stan? Or do something else? Interestingly, the GP regression perspective on this is efficient in Stan: considering a linear kernel, the covariance K becomes 840x840 and you just sample the observations:

y ~ N(0, K + \sigma^2 I)

But again this doesn't give us a clear interpretation of which are the important variables. So maybe I should really just do group lasso? I guess there are some group lasso packages in R to try...
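For what it's worth, the linear-kernel view does let you get the weights back, at least as a posterior mean. A minimal NumPy sketch, assuming the prior beta ~ N(0, tau^2 I) so that K = tau^2 X X^T (the `tau2` scaling and all variable names are assumptions, not something stated above): the posterior mean of the weights is tau^2 X^T (K + sigma^2 I)^{-1} y, which only needs an n x n solve. The primal ridge solve is included purely as a check on toy data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 200  # small stand-in for 840 x 6578
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

sigma2 = 0.5  # noise variance (assumed known here)
tau2 = 2.0    # prior variance of each weight (assumed)

# Dual / GP form: linear kernel, only an n x n linear system.
K = tau2 * X @ X.T
alpha = np.linalg.solve(K + sigma2 * np.eye(n), y)
beta_dual = tau2 * X.T @ alpha

# Primal ridge form: p x p linear system, shown only for comparison.
lam = sigma2 / tau2
beta_primal = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Both routes give the same posterior mean of the weights.
same = np.allclose(beta_dual, beta_primal)
```

This gives point estimates of all the weights at the cost of the 840x840 problem, though it does not by itself say which variables matter.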

@flaxter

flaxter commented Nov 3, 2016

update: found a bug with the KDE features, maybe this will work better after I fix that

@flaxter

flaxter commented Nov 3, 2016

OK, it's better but still pretty damn slow. @yxwang1988 what do you think about implementing ORF?
