Add option n_modes="all" to perform the full decomposition #158

Open
nicrie opened this issue Mar 9, 2024 · 2 comments
Labels
new feature New feature or request

Comments

@nicrie
Collaborator

nicrie commented Mar 9, 2024

As an aside, it would be nice to have an option like n_modes="all" that figures out the rank of the data early on and sets n_modes = rank. When I want to run experiments like this, I usually first try to fit the model with a million modes, then read the rank from the error message and plug it in.

Looks like sklearn.decomposition.PCA does this by default: n_components=None

Originally posted by @slevang in #156 (comment)

@nicrie nicrie added the new feature New feature or request label Mar 9, 2024
@nicrie
Collaborator Author

nicrie commented Mar 9, 2024

Agree with you @slevang, that'd be nice. Currently, the rank is only computed when we fit the Decomposer:

n_coords1 = len(X.coords[dims[0]])
n_coords2 = len(X.coords[dims[1]])
rank = min(n_coords1, n_coords2)

The first opportunity to compute the rank is probably within the Stacker of the Preprocessor, since by then we know the final size of the matrix.
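A minimal sketch of how such an option could be resolved once the stacked matrix shape is known. The helper name `resolve_n_modes` and its signature are hypothetical and not part of the existing code base; it reuses the same upper bound as `min(n_coords1, n_coords2)` above.

```python
from typing import Sequence, Union


def resolve_n_modes(n_modes: Union[int, str], shape: Sequence[int]) -> int:
    """Hypothetical helper: turn ``n_modes`` into a concrete integer.

    ``shape`` is the (n_samples, n_features) shape of the stacked 2D matrix.
    """
    # Upper bound on the rank, same rule as min(n_coords1, n_coords2).
    max_modes = min(shape)

    if n_modes == "all":
        return max_modes

    # Reject anything that is not a plain integer (bool is excluded on purpose).
    if isinstance(n_modes, bool) or not isinstance(n_modes, int):
        raise TypeError(f"n_modes must be an int or 'all', got {n_modes!r}")

    if not 1 <= n_modes <= max_modes:
        raise ValueError(
            f"n_modes must be between 1 and {max_modes}, got {n_modes}"
        )
    return n_modes
```

With something like this, `resolve_n_modes("all", X.shape)` would replace the "fit with a million modes and read the error message" workaround, while an explicit integer keeps today's behaviour.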

I don't know what you think about the default value. Personally, I mostly prefer a fast decomposition over an exact one. If we made n_modes="all" the default, it would always trigger the full SVD.
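To illustrate the trade-off, here is a small sketch in plain NumPy. It is only a stand-in for the library's own solver, and the array sizes and variable names are made up for the example. The full decomposition yields `min(X.shape)` modes and reconstructs the data exactly, whereas keeping `k` modes gives a low-rank approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))  # 50 samples, 8 features (illustrative sizes)
X = X - X.mean(axis=0)            # center along the sample dimension

# Full (exact) decomposition: one singular value per possible mode.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
n_all = min(X.shape)              # what n_modes="all" would resolve to

# Using all modes reproduces X up to floating-point error.
X_full = (U * s) @ Vt

# Truncating to k modes keeps only the leading singular triplets.
k = 3
X_k = (U[:, :k] * s[:k]) @ Vt[:k]
```

For large inputs a truncated or randomized solver that only computes the first `k` modes is usually much cheaper than the full SVD, which is the argument for not making "all" the default.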

@slevang
Collaborator

slevang commented Mar 9, 2024

Yep, agree with all that. Fine to leave the default alone, and I was thinking of checking the rank around the same place in the preprocessors. Then it's just a question of how to propagate the information back to the rest of the class; I haven't looked at the details yet.
