[FEATURE] Support chronological split for next-basket recommendation #604

Open
lthoang opened this issue Mar 22, 2024 · 4 comments
@lthoang
Member

lthoang commented Mar 22, 2024

Description

The dataset could be split by the order of baskets within each user's sequence.
NextBasketEvaluation currently supports splitting data by user only. The data is split based on user_id, so the users in the training, validation, and test sets do not overlap.
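
To make the current behavior concrete, here is a minimal sketch of a user-based split. It is not the actual NextBasketEvaluation code: the function name `split_by_user`, the ratio parameters, and the seed are assumptions made for illustration only.

```python
import random
from typing import Dict, List, Tuple

Sequences = Dict[str, List[str]]  # user_id -> ordered list of basket ids


def split_by_user(
    sequences: Sequences,
    val_ratio: float = 0.1,
    test_ratio: float = 0.1,
    seed: int = 42,
) -> Tuple[Sequences, Sequences, Sequences]:
    """Hypothetical user-based split: every user lands in exactly one set."""
    users = sorted(sequences)
    random.Random(seed).shuffle(users)  # seeded shuffle for reproducibility

    n_test = int(len(users) * test_ratio)
    n_val = int(len(users) * val_ratio)
    test_users = users[:n_test]
    val_users = users[n_test:n_test + n_val]
    train_users = users[n_test + n_val:]

    def pick(ids: List[str]) -> Sequences:
        return {u: sequences[u] for u in ids}

    return pick(train_users), pick(val_users), pick(test_users)
```

With this kind of split, a validation or test user has no baskets in the training data at all.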

With the requested feature, users in the validation and test sets should also appear in the training data.

Expected behavior with the suggested feature

We can keep the last basket in each sequence for testing, the second-to-last basket for validation, and the rest for training. For example, a sequence b1 b2 b3 b4 can be split into train b1 b2, validation b3, and test b4, where the history baskets for validation are b1 b2 and the history baskets for test are b1 b2 b3.
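
The split described above can be sketched as follows. This is only an illustration of the proposal, not an existing Cornac API: the function name `split_chronologically` and the return layout are hypothetical, and sending users with fewer than three baskets entirely to training is an assumption the issue does not specify.

```python
from typing import Dict, List, Tuple

Sequences = Dict[str, List[str]]                # user_id -> ordered basket ids
Held = Dict[str, Tuple[List[str], str]]         # user_id -> (history, target basket)


def split_chronologically(sequences: Sequences) -> Tuple[Sequences, Held, Held]:
    """Hypothetical leave-last-out split over each user's basket sequence."""
    train: Sequences = {}
    val: Held = {}
    test: Held = {}
    for user, baskets in sequences.items():
        if len(baskets) < 3:
            # Assumption: too short to hold out two baskets, keep all in training.
            train[user] = list(baskets)
            continue
        train[user] = list(baskets[:-2])                # everything before validation
        val[user] = (list(baskets[:-2]), baskets[-2])   # history excludes b3
        test[user] = (list(baskets[:-1]), baskets[-1])  # history includes b3
    return train, val, test


train, val, test = split_chronologically({"u1": ["b1", "b2", "b3", "b4"]})
print(train["u1"])  # ['b1', 'b2']
print(val["u1"])    # (['b1', 'b2'], 'b3')
print(test["u1"])   # (['b1', 'b2', 'b3'], 'b4')
```

Here every user with a validation or test basket also has training baskets, which is the property this feature asks for.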

@lthoang lthoang self-assigned this Mar 22, 2024
@tqtg
Member

tqtg commented Mar 27, 2024

Do we need to preserve the order of b3 for validation and b4 for testing, or can it be arbitrary?

@lthoang
Member Author

lthoang commented Mar 27, 2024

> Do we need to preserve the order of b3 for validation and b4 for testing, or can it be arbitrary?

@tqtg I think we should preserve the order, with b3 for validation and b4 for testing, because the validation basket can then be included in the history baskets for the test basket.

We could also add an option that specifies whether to keep the order. However, we would need a motivating example for the arbitrary case.

@tqtg
Member

tqtg commented Mar 27, 2024

Using validation data for testing is also a debatable topic. I think it's good to include a couple of paper references as motivation for this feature.

@lthoang
Member Author

lthoang commented Mar 27, 2024

> Using validation data for testing is also a debatable topic. I think it's good to include a couple of paper references as motivation for this feature.

MMNR: Multi-view Multi-aspect Neural Networks for Next-basket Recommendation, code https://github.com/Hiiizhy/MMNR
