Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QUIC alternative for adaptive graphical lasso (esp. with weights enforcing zero-edges) #74

Open
mnarayan opened this issue Dec 11, 2016 · 9 comments

Comments

@mnarayan
Copy link
Member

mnarayan commented Dec 11, 2016

Optimization algorithm (alternative to QUIC) that significantly speeds up any adaptive variant of graphical lasso where some entries of precision matrix $\Theta$ are enforced to be zero. It takes these zero contraints and converts them into constraints on sets of pathways that make solving the graphical lasso criterion much faster.

Open question: Can all zero constraints be transformed into a pathway formulation? Need to work this out.

A cython/python implementation available here
https://github.com/maximsch2/pglasso

References

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4497513/
Path coding penalties: http://lear.inrialpes.fr/people/mairal/resources/pdf/path-flow.pdf

@jasonlaska
Copy link
Member

This would be awesome (and fairly easy to integrate) if it's compatible w the theory. We could also add it as an option for the second step estimator. One thing to keep in mind, in two of the adaptive penalty cases, we use QuicGraphLassoCV to obtain a better penalty scale (https://github.com/skggm/skggm/blob/develop/inverse_covariance/adaptive_graph_lasso.py#L123-L131); would we effectively have to build a CV version of this algorithm as well?

@mnarayan
Copy link
Member Author

mnarayan commented Dec 11, 2016 via email

@jasonlaska
Copy link
Member

jasonlaska commented Dec 11, 2016

Without a path mode, we can just use sklearn's GridSearchCV after we wrap it as an InverseCovarianceEstimator. Not as optimized but works.

@jasonlaska
Copy link
Member

*GridSearchCV

@maximsch2
Copy link

I would be interesting in helping to port PathGL into skggm.
Answering the questions raised:

  • There is no explicit "path" mode implemented right now, but warm-starting should probably work fine for this purpose.
  • Adding weighting should also be possible, but that might require going deeper into the code.
  • Converting from zero constraints to pathway constraints optimally sounds non-trivial, but there might be some heuristics that can be applied to quickly get an approximate set of pathways (with additional constraints to set extra values to zero if not covered by pathway constraint). Are you interested in purely inferred pathways in this fashion or user-specified pathways are also an option?

@mnarayan
Copy link
Member Author

@maximsch2 This would be fantastic! If I recall correctly, your paper mentioned solving the glasso problem one pathway at a time by wrapping around existing solvers like QUIC.

As a first step, given an initial 0/1 adjacency matrix (which might be inferred from the data or side information available to the user), it would be nice to transform this into an approximate set of pathway constraints.

  • Ideally with the option not to enforce extra zeros if not covered by pathway contraints if the main goal was to eliminate pathways that we don't expect to be permissable. But if the main goal is to speed up an adaptive/relaxed glasso then we would want to force 0/1 pattern exactly.

  • If we do this as a wrapper around calls to QUIC then I imagine we can leverage existing path mode as well.

I propose using an API similar to that of AdaptiveGraphLasso, i.e. as a wrapper around a lower level estimator like QUICGraphLasso. This way if we had other solvers/estimators in the future, those can be easily incorporated.
https://github.com/skggm/skggm/blob/develop/inverse_covariance/adaptive_graph_lasso.py#L13

If you are interested in contributing other solvers like the dp-glasso that would be welcome too. Feel free to open an issue/PR for this. There are fast boxQP implementations that we could great cython bindings for.

@maximsch2
Copy link

Yeah, in theory, any existing solver should work, but it a) needs to be adapted to support shifted penalty b) support warm-start, as pglasso is an iterative algorithm essentially doing message-passing over the pathways.

As the DP-GLasso we are using is warm-startable, the entire thing is warm-startable as well, so getting a path over lambda values efficiently should not be an issue.

I think the main serious issue is recovering the pathway structure given the 0/1 adjacency matrix. Generally speaking, the natural structure is to just have a pathway for each pair of genes, but that will likely not provide any improvement in computation in the current implementation (but it might be actually an interesting problem to see if it is possible to implement it that way - if all pathways are just 2x2 pathways, then it should be possible to solve the pathway-local glasso problem in closed form, so the only computational complexity will come from iteratively combining all those solutions).
As a next step, finding cliques in the connectivity graph gets us to the NP-complete territory, so solving it exactly is not an option likely.

For boxQP, do you have pointers for the implementations? I'm thinking integrating shifted DP-Glasso implementation might be a good first step. Shifted DP-Glasso does require shifted boxQP, though, so existing implementations might not be applicable.

@mnarayan
Copy link
Member Author

mnarayan commented Jan 4, 2018

@maximsch2 It seems indeed the next best step is to create a solver for the Shifted DP-Glasso similar to the current QUICGraphLasso API with a single lambda or a path mode via warm starts.

Some things to note, it would be nice to support weighted L1 penalties as is supported for QUIC. This will all the adaptive, model averaging options will be open to the shifted dp-glasso solver as well. What do you think about splitting it up such that the weights only apply to the pathway under consideration and not the shift term as in \lambda |W .* \Theta|_{1} + |\Delta |_{1} versus applying them together \lambda |W .* (\Theta|_{1} + \Delta) |_{1}?

We could perhaps also look into how easily we might incorporate a shifted L1 penalty option for QUIC too. It would been supporting a shift variable and incorporating that into L1 penalty calculations in the C implementation of QUIC. But this can go under a separate issue.

I think the main serious issue is recovering the pathway structure given the 0/1 adjacency matrix.

Regarding recovering the pathways itself, how about we go with the assumption that pathways are available and provided for the function to begin with? I have some ideas on extracting pathway information in the context of neuroimaging, but it looks like going from 0/1 adjacency to pathways is something we might want to implement separately as what constitutes a reasonable approximation might be domain specific.

@mnarayan
Copy link
Member Author

mnarayan commented Jan 4, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants