Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent convergence checks and options #481

Open
aarmey opened this issue Jan 20, 2023 · 2 comments
Open

Consistent convergence checks and options #481

aarmey opened this issue Jan 20, 2023 · 2 comments

Comments

@aarmey
Copy link
Contributor

aarmey commented Jan 20, 2023

Related to #470, I have been wondering if we should make the convergence handling consistent and reusable across methods. Currently, methods handle convergence checks in slightly different ways—some have an absolute threshold (PARAFAC2), or a relative threshold set according to the absolute difference in error (CP) or fractional difference in error (PARAFAC2). Finally, there is repeated code for storing a history of the error, checking these thresholds, printing the error at each iteration, etc.

One possibility would be to have an error tracker class, that is instantiated by any method. This class takes the tolerances on instantiation and handles storing a history of the errors. It has one method that is called on each iteration of a method, and returns a boolean as to whether the method should continue. Printing the per-iteration error is handled by this method (with a verbose flag).

My impression is that this would clean up a lot of code and force methods to have consistent convergence criteria. Note that the method itself would still have to handle calculating the absolute error on each iteration (so efficiencies like reuse of MTTKRP in CP can still occur).

@JeanKossaifi
Copy link
Member

Thanks @aarmey. I am a little conflicted by such refactoring. Overall I like to have each algorithm being clearly readable from source and not spread out important bits in different files. Convergence can be pretty method specific as you mentioned, so I think it makes sense to do it in each file. Plus, the code to store error, and check threshold is pretty trivial so I don't think there is a need to centralize that. I often use scikit-learn as reference, they manage really well the balance features/complexity despite the growing codebase.

I almost find the callback option already overkill in some ways but if we really want to refactor I guess we could use a callback to let users store the error history and not do that ourselves in the also.

In any case, we could probably use some harmonization of the error metrics across the codebase as you're saying.

@aarmey
Copy link
Contributor Author

aarmey commented Jan 20, 2023

Indeed, I was worried this might be too clever and impact readability. Agreed that we can do some work to unify the convergence options, but leave it at that.

@aarmey aarmey changed the title Unified and consistent error/convergence tracking Consistent convergence checks and options Jan 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants