Better documentation of missing value imputation #484
Comments
Hi @Kiord. When the rank is full, the SVD represents the matrix exactly. As a consequence, it reproduces whatever values were filled in at the missing positions, so there is no way for the SVD to impute them. Imputation only occurs because a truncated SVD is an approximation of the matrix, that is, an incomplete representation. You should find that the imputed values do change once the rank is less than full.
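The explanation above can be checked numerically. The following is a minimal NumPy sketch, not TensorLy's actual implementation: `truncated_svd` and `impute_by_svd` are hypothetical helper names, and the loop only assumes the general scheme discussed in this issue (refill the masked entries from the current low-rank reconstruction, then recompute the SVD).

```python
import numpy as np


def truncated_svd(matrix, rank):
    """Return the leading `rank` singular triplets of `matrix`."""
    U, S, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :rank], S[:rank], Vt[:rank, :]


def impute_by_svd(matrix, mask, rank, n_iter=5):
    """Hypothetical helper: refill entries where mask == 0 from a rank-`rank` SVD."""
    filled = matrix.copy()
    U, S, Vt = truncated_svd(filled, rank)
    for _ in range(n_iter):
        recon = (U * S) @ Vt  # current low-rank reconstruction
        # Keep observed entries, replace masked entries with the reconstruction.
        filled = matrix * mask + recon * (1 - mask)
        U, S, Vt = truncated_svd(filled, rank)
    return (U * S) @ Vt


rng = np.random.default_rng(0)
matrix = rng.standard_normal((6, 6))
mask = np.ones_like(matrix)
mask[0, 0] = 0  # treat this entry as missing

full = impute_by_svd(matrix, mask, rank=6)  # full rank: nothing can change
low = impute_by_svd(matrix, mask, rank=2)   # reduced rank: an approximation

print("full rank reproduces the input:", np.allclose(full, matrix))
print("rank 2 reproduces the input:", np.allclose(low, matrix))
```

At full rank the reconstruction equals the input up to floating-point error, so the refill step writes the placeholder value straight back. Only the reduced-rank call produces a different value at the masked position.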
Hi, thank you. I was unsure if this behavior was wanted.
No worries. Happy to help!
Thanks @aarmey - do you think it would make sense to add a Notes section on imputation in the docstring?
I am not sure if this should be user-facing. If a user wants to impute by SVD, there are better options in
I agree -- maybe just a short note section in the docstring about what we mean by mask etc. We can also add separately a short section in the user guide about missing data with our tensor methods in general -- I guess there is no such thing as too exhaustive a documentation :)
Hello @JeanKossaifi, I can try to do that. What I understood is that
Hi @Kiord. That is right. There is nothing wrong with using svd_interface for imputation; we just have not added a lot of the basic functionality one might expect. For example, it only allows you to set a constant number of iterations, rather than running until some convergence condition is met. We also do not check the inputs very carefully to ensure they are reasonable (such as the requested rank being lower than that of the data).
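To illustrate the two missing features mentioned above, here is a rough NumPy sketch of an imputation loop that stops on a convergence condition and rejects a rank that is too high. It is not part of TensorLy: the function name `impute_until_converged`, the parameters `tol` and `max_iter`, and the relative-change stopping rule are all assumptions made for this example.

```python
import numpy as np


def impute_until_converged(matrix, mask, rank, tol=1e-8, max_iter=500):
    """Hypothetical sketch: iterative SVD imputation with a stopping rule.

    Assumes `matrix` holds finite placeholder values where mask == 0.
    Returns the filled matrix and the number of iterations performed.
    """
    if rank >= min(matrix.shape):
        # A full-rank SVD reproduces the input exactly, so nothing is imputed.
        raise ValueError("rank must be lower than min(matrix.shape)")

    filled = matrix * mask  # start with zeros at the missing positions
    for iteration in range(1, max_iter + 1):
        U, S, Vt = np.linalg.svd(filled, full_matrices=False)
        recon = (U[:, :rank] * S[:rank]) @ Vt[:rank, :]
        # Keep observed entries, refill missing ones from the reconstruction.
        updated = matrix * mask + recon * (1 - mask)
        # Relative change between successive iterates (assumed criterion).
        denom = max(np.linalg.norm(filled), np.finfo(float).eps)
        change = np.linalg.norm(updated - filled) / denom
        filled = updated
        if change < tol:
            break
    return filled, iteration


# Usage: a rank-1 matrix with one entry treated as missing.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([1.0, 0.5, 2.0, 1.5])
matrix = np.outer(a, b)
mask = np.ones_like(matrix)
mask[2, 1] = 0

filled, n_iter = impute_until_converged(matrix, mask, rank=1)
print("iterations used:", n_iter)
print("imputed value:", filled[2, 1], "original value:", matrix[2, 1])
```

The observed entries are never modified; only positions where `mask == 0` are updated, and the loop ends either when successive iterates stop changing or when `max_iter` is reached.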
Describe the bug
I am trying to impute values with `svd_interface` in a matrix, but the mask values don't seem to impact the result. I am using `n_eigenvecs=matrix.shape[0]` (no compression / no data loss).
Steps or Code to Reproduce
Expected behavior
The values where `mask == 0` should differ from `matrix` to `recon`.
Actual result
The imputing code is as follows:
I think the result `U, S, V` does not depend on `mask`:
`matrix` transformed in svd form (no loss data)
`matrix` is equal to `U @ tl.diag(S) @ V`, this is equivalent to `matrix=matrix`
`U, S, V` (no loss data)
`matrix` is equal to `U @ tl.diag(S) @ V`, this is equivalent to `matrix=matrix`
`U, S, V` (no loss data)
In the end `mask` has no effect on the result.
Versions
Windows-10-10.0.22000-SP0
Python 3.10.4 | packaged by conda-forge | (main, Mar 30 2022, 08:38:02) [MSC v.1916 64 bit (AMD64)]
NumPy 1.21.5
SciPy 1.8.0