
implement sparse PCA/dictionary-encoding #254

Open
ExpandingMan opened this issue Nov 18, 2022 · 4 comments
@ExpandingMan
Contributor

I'm interested in implementing sparse dictionary learning and sparse PCA. These techniques are implemented in scikit-learn; see here and the references therein.

I might take this on if I deem it a sufficiently short project.

@joshday
Owner

joshday commented Nov 18, 2022

Looks cool! If you go for it, let me know if you have any questions.

@ExpandingMan
Contributor Author

Alright, I've read those papers this morning and have some thoughts.

Both of these methods essentially amount to factorizing the training data as $UV$ with various constraints on the form of these matrices, in particular that they be sparse (or just that $V$ be sparse). As far as I can tell, the only real difference between the two methods is that "sparse PCA" constrains $V^{T}V=1$ whereas "dictionary learning" provides no such guarantee. The authors of the latter imply this is an advantage, though at first glance it seems to me like a disadvantage. Dictionary learning, at least in that particular paper, is explicitly an online method, whereas it's not obvious to me how to do sparse PCA without a full batch. My first guess would probably be to see whether I could adapt the dictionary learning algorithm presented in that paper to the orthogonality constraint, but I haven't thought this through.
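
To fix notation, the per-sample objective from the dictionary learning paper is roughly the following (writing $D$ for the dictionary and $\alpha$ for the sparse code of a single sample $x$, with $\lambda$ controlling sparsity; how $D$ and $\alpha$ map onto the $U$ and $V$ above depends on which factor you take to be sparse):

$$\min_{D,\ \alpha}\ \tfrac{1}{2}\lVert x - D\alpha\rVert_2^2 + \lambda\lVert\alpha\rVert_1, \qquad \lVert d_j\rVert_2 \le 1 \ \text{for each column } d_j \text{ of } D,$$

and, as I read it, the "sparse PCA" variant additionally imposes the $V^{T}V=1$-style constraint mentioned above.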

So, I'm still seriously considering implementing dictionary learning, as it should be relatively straightforward. The only caveat, as far as a PR to OnlineStats.jl goes, is that it necessarily involves an $\ell_1$-regularized regression. I think it would be wiser to use Lasso.jl or something similar for this rather than attempt to implement it from scratch, though that would mean adding a dependency.
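
For concreteness, the $\ell_1$ step in question is just a lasso solve per sample. If it did turn out to make sense to avoid the dependency, a minimal self-contained sketch via proximal gradient (ISTA) might look like the following; the name `sparse_code` and all parameters are purely illustrative, not Lasso.jl's API or anything settled:

```julia
using LinearAlgebra

soft_threshold(z, t) = sign(z) * max(abs(z) - t, 0.0)

# Solve min_α ½‖x - D*α‖² + λ‖α‖₁ for one sample x against dictionary D (ISTA).
function sparse_code(D::AbstractMatrix, x::AbstractVector, λ::Real; iters::Int=200)
    α = zeros(size(D, 2))
    L = opnorm(D)^2 + eps()              # Lipschitz constant of the smooth part's gradient
    for _ in 1:iters
        g = D' * (D * α - x)             # gradient of ½‖x - Dα‖²
        α = soft_threshold.(α .- g ./ L, λ / L)   # proximal (soft-thresholding) step
    end
    return α
end
```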

What dictionary learning would offer over anything that's currently available in OnlineStats.jl is a dimensionality reduction technique with a sparse transformation matrix, which would be useful since one can easily wind up with a fairly enormous matrix for high-dimensional training data.
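
As a rough illustration of the storage point (the sizes and sparsity level here are purely hypothetical):

```julia
using SparseArrays

# Hypothetical: 10_000 input features mapped to 500 components.
V_dense  = randn(10_000, 500)            # dense transformation matrix
V_sparse = sprandn(10_000, 500, 0.02)    # same shape, ~2% nonzero entries

Base.summarysize(V_dense)    # roughly 40 MB
Base.summarysize(V_sparse)   # roughly 1–2 MB
```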

Therefore, I think, assuming I follow through and do this, I will create a new package which has OnlineStats.jl as a dependency. Then, if and when I finish, I will post a link here and you can review, and if you'd like to merge it into OnlineStats.jl I'd be happy to oblige.

@joshday
Owner

joshday commented Nov 19, 2022

Sounds like a good plan to me! FYI you may be able to get away with just OnlineStatsBase as a dependency.
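
The OnlineStatsBase interface is small: subtype `OnlineStat` and implement `_fit!` (plus `value`/`_merge!` as needed). Something like this skeleton, where the type, fields, and constructor are just placeholders:

```julia
using OnlineStatsBase

# Hypothetical skeleton of an online dictionary-learning stat.
mutable struct OnlineDictionary <: OnlineStat{AbstractVector}
    D::Matrix{Float64}   # current dictionary estimate
    λ::Float64           # ℓ1 penalty for the sparse-coding step
    n::Int               # number of observations seen
end
OnlineDictionary(p::Int, k::Int; λ=0.1) = OnlineDictionary(zeros(p, k), λ, 0)

function OnlineStatsBase._fit!(o::OnlineDictionary, x)
    o.n += 1
    # sparse-code x against o.D, then take a dictionary-update step (omitted here)
end

OnlineStatsBase.value(o::OnlineDictionary) = o.D
```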

@ExpandingMan
Contributor Author

Alright, I've lost a lot of interest in this method, as I've realized it's probably poorly suited to dimensionality reduction in most cases. It looks like a really good compression method, but since it achieves its compression only through sparsity, you tend to leave with more dimensions than you came in with. It might work well for dimensionality reduction in some cases if the initial conditions are chosen appropriately, but I haven't figured out how to do that yet.

Anyway, I thought I should note here that I have OnlineSparseLearning.jl up, which seems to be basically working but is certainly not usable yet. I may or may not finish this depending on my needs.
