
[FEATURE] Allow running evaluation on validation set during training or right after training is done #613

Open
lthoang opened this issue Apr 3, 2024 · 4 comments

Comments

@lthoang
Member

lthoang commented Apr 3, 2024

Description

This is a breaking change. In many training/validation/test pipelines, evaluation on the validation data happens during training (for monitoring/model selection).
The current version of cornac evaluates the validation set the same way as the test set.

Expected behavior with the suggested feature

  • Evaluate on the validation set periodically (every n epochs) and report the validation results (see the sketch after this list).
  • Some models (e.g., MMNR) use different external data for validation and test, which requires a way to distinguish validation evaluation from test evaluation.
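
A minimal sketch of what the first bullet could look like from the model's side, assuming a cornac `Recommender` subclass; `val_every`, `_train_one_epoch`, and `_compute_val_recall` are illustrative placeholders, not existing cornac API:

```python
from cornac.models import Recommender

class MyModel(Recommender):
    def __init__(self, name="MyModel", max_iter=50, val_every=5, **kwargs):
        super().__init__(name=name, **kwargs)
        self.max_iter = max_iter
        self.val_every = val_every  # evaluate on val_set every `val_every` epochs

    def _train_one_epoch(self, train_set):
        pass  # placeholder for the model's actual parameter updates

    def _compute_val_recall(self, val_set, k=10):
        return 0.0  # placeholder: rank items for each validation user, compute Recall@k

    def fit(self, train_set, val_set=None):
        super().fit(train_set, val_set)
        for epoch in range(1, self.max_iter + 1):
            self._train_one_epoch(train_set)
            if val_set is not None and epoch % self.val_every == 0:
                recall = self._compute_val_recall(val_set)
                print(f"[{self.name}] epoch {epoch}: validation Recall@10 = {recall:.4f}")
        return self
```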

Other Comments

@tqtg
Member

tqtg commented Apr 3, 2024

Couple of things:

  • if it's only for reporting performance on the validation set, we can implement it as part of the model training loop instead of part of the general pipeline
  • for the MMNR model, what does it mean to use different external data for validation and test? Can we have a specific example?

@lthoang
Member Author

lthoang commented Apr 4, 2024

for the MMNR model, what does it mean to use different external data for validation and test? Can we have a specific example?

@tqtg, if we look closely at this function, we can see that MMNR uses different history matrices for train data, validation data, and test data.

My idea is to break down the evaluation pipeline into train, validation, and test stages to reduce the redundancy of the current implementation (currently, cornac allows manipulating val_set along with train_set inside the fit function, which may cause val_set to be re-evaluated once or twice inside the score or rank functions).
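
A rough sketch of the redundancy being described, assuming a toy `Recommender` subclass and that val_set is a cornac Dataset exposing `uir_tuple`; the caching logic and `_all_item_scores` are illustrative, not existing cornac code:

```python
import numpy as np
from cornac.models import Recommender

class CachedValModel(Recommender):
    """Toy model: val_set is scored once inside fit (for monitoring), and the
    cached scores are reused when the pipeline later calls score() on val_set."""

    def fit(self, train_set, val_set=None):
        super().fit(train_set, val_set)
        self._n_items = train_set.num_items
        self._cached = {}
        if val_set is not None:
            # inference on val_set already happens here, during training ...
            for u in set(val_set.uir_tuple[0].tolist()):
                self._cached[u] = self._all_item_scores(u)
        return self

    def _all_item_scores(self, user_idx):
        # stand-in for real inference with learned parameters
        return np.random.rand(self._n_items)

    def score(self, user_idx, item_idx=None):
        # ... so the post-training evaluation pass over val_set does not repeat it
        scores = self._cached.get(user_idx)
        if scores is None:
            scores = self._all_item_scores(user_idx)
        return scores if item_idx is None else scores[item_idx]
```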

@tqtg
Member

tqtg commented Apr 4, 2024

We provide models with both val_set and train_set so that they can perform early stopping, hyper-parameter optimization, or any kind of trade-off for model selection inside the training loop. I don't get your very last sentence about the score and rank functions.
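
For context, a small illustration of the early-stopping bookkeeping that passing val_set into fit enables inside the training loop; this is purely illustrative and not existing cornac code:

```python
def should_stop(history, patience=5, min_delta=1e-4):
    """Return True when the last `patience` validation scores failed to improve
    on the best score seen before them by at least `min_delta`."""
    if len(history) <= patience:
        return False
    best_before = max(history[:-patience])
    recent_best = max(history[-patience:])
    return recent_best < best_before + min_delta

# Inside fit(): append a validation metric (e.g. Recall@10 on val_set) each
# epoch and break out of the training loop once should_stop(history) is True.
history = [0.10, 0.12, 0.13, 0.13, 0.129, 0.128, 0.127, 0.126, 0.125]
print(should_stop(history))  # True -> no recent improvement, stop training
```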

@lthoang
Member Author

lthoang commented Apr 5, 2024

@tqtg let's say we evaluate val_set inside fit for early stopping/monitoring. After training is done and we perform evaluation on VALIDATION, the model has to run inference on val_set again.
