I noticed something odd when using this function to split my data into train and test sets.
The distribution of users by the number of ratings they have given looks like this:
| Number of ratings given | Number of users |
| --- | --- |
| 1 | 6000 |
| 2 | 3000 |
| 3 | 200 |
| 4 | 30 |
The documentation states that users with more than K ratings have one of their ratings put into the test set, and the others into the train set.
So when I used the function with K = 1, I expected to get 3230 records in the test set (3000 + 200 + 30), but only got 230.
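To make the arithmetic concrete, here is a small plain-Python sketch. It does not call the library; the helper `users_with_more_than` is a hypothetical name, and the "more than K + 1" rule is only my assumption about what would produce the number I observed.

```python
# Number of users for each rating count, copied from the table in this issue.
users_by_rating_count = {1: 6000, 2: 3000, 3: 200, 4: 30}


def users_with_more_than(threshold, distribution):
    """Count users whose number of ratings is strictly greater than `threshold`."""
    return sum(n_users for n_ratings, n_users in distribution.items()
               if n_ratings > threshold)


k = 1
# Documented rule: users with more than K ratings contribute to the test set.
expected = users_with_more_than(k, users_by_rating_count)
# Assumed stricter rule: only users with more than K + 1 ratings contribute.
observed_like = users_with_more_than(k + 1, users_by_rating_count)

print(expected, observed_like)  # 3230 230
```

The second number matches what I got, which suggests the effective threshold is one higher than the documented one.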
So my question is: shouldn't this line
implicit/implicit/evaluation.pyx
Line 189 in 6491663
look like this
or this
instead?
My guess is that it was done this way to prevent a user with 2 ratings from ending up with only 1 rating in the train set, because, if I understand correctly, users with a single rating are useless for training. Please verify.
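To illustrate the trade-off I am asking about, here is a toy leave-k-out split in plain Python. It is a sketch of the general technique, not the code in `evaluation.pyx`; the function `leave_k_out` and its `min_train` parameter are hypothetical names I made up for this example.

```python
import random


def leave_k_out(user_items, k=1, min_train=1, seed=0):
    """Hold out k items per user, but only for users who would still
    keep at least `min_train` items in the train set afterwards."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user, items in user_items.items():
        if len(items) >= k + min_train:
            held_out = set(rng.sample(sorted(items), k))
            test[user] = sorted(held_out)
            train[user] = sorted(i for i in items if i not in held_out)
        else:
            # Too few ratings: everything stays in train, nothing goes to test.
            train[user] = sorted(items)
    return train, test


toy = {"u1": ["a"], "u2": ["a", "b"], "u3": ["a", "b", "c"]}

# min_train=1 corresponds to "more than K ratings": u2 is split and keeps 1 train rating.
loose_train, loose_test = leave_k_out(toy, k=1, min_train=1)
# min_train=2 corresponds to "more than K + 1 ratings": u2 is left untouched.
strict_train, strict_test = leave_k_out(toy, k=1, min_train=2)

print(sorted(loose_test), sorted(strict_test))  # ['u2', 'u3'] ['u3']
```

With `min_train=1` the user with 2 ratings ends up with a single train rating, which is the situation I suspect the current threshold is meant to avoid.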