candidates argument for FactorizedTopK #688
Comments
Top-K categorical accuracy is the percentage of records whose (non-zero) targets appear among the top K predictions. For example, if a user clicked or rated a movie positively, and that movie has the 11th-highest score in the model's predictions for that user, it would not count toward top-10 categorical accuracy, but it would count toward top-25 categorical accuracy.
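To make the definition concrete, here is a minimal plain-Python sketch of it. It is not how TFRS computes the metric internally; the function name `top_k_accuracy`, the score matrix, and the target indices are all made up for illustration.

```python
def top_k_accuracy(scores, targets, k):
    """Fraction of rows whose target index is among the k highest-scoring candidates."""
    hits = 0
    for row, target in zip(scores, targets):
        # Candidate indices sorted by descending score for this user.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


# Hypothetical scores: 2 users x 4 candidates.
scores = [
    [0.1, 0.9, 0.3, 0.5],  # candidates ranked 1, 3, 2, 0
    [0.8, 0.2, 0.6, 0.4],  # candidates ranked 0, 2, 3, 1
]
# Hypothetical index of the candidate each user actually interacted with.
targets = [3, 1]

acc_top2 = top_k_accuracy(scores, targets, k=2)  # only the first user's target is in the top 2
acc_top4 = top_k_accuracy(scores, targets, k=4)  # every target is in the top 4
```

With these made-up numbers, the second user's target is ranked last, so it is a miss at K=2 and a hit at K=4, the same way the 11th-ranked movie misses top 10 but makes top 25.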
Thanks for your answer, @rlcauvin
I use unique candidates in my retrieval models. I suppose specifying candidates with duplicates could result in some of the duplicates appearing more than once in the top K recommendations for a user, or in implicit negatives skewing the model. I haven't examined the MovieLens dataset, but I don't see any good reason for the movies file to contain duplicates.
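As a rough illustration of the first concern, the plain-Python sketch below shows a duplicated candidate taking up two of the top K slots, and one way to de-duplicate while preserving order. The titles, scores, and the helper `top_k_titles` are hypothetical and not part of TFRS.

```python
def top_k_titles(candidate_titles, candidate_scores, k):
    """Return the titles of the k highest-scoring candidates."""
    ranked = sorted(
        zip(candidate_scores, candidate_titles),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [title for _, title in ranked[:k]]


# Hypothetical candidate list in which "A" appears twice.
titles = ["A", "B", "A", "C"]
# Hypothetical scores for one user; identical titles get identical scores.
score_by_title = {"A": 0.9, "B": 0.5, "C": 0.7}
scores = [score_by_title[t] for t in titles]

with_duplicates = top_k_titles(titles, scores, k=2)

# De-duplicate while preserving first-seen order.
unique_titles = list(dict.fromkeys(titles))
unique_scores = [score_by_title[t] for t in unique_titles]
without_duplicates = top_k_titles(unique_titles, unique_scores, k=2)
```

Here `with_duplicates` is `["A", "A"]`, so "C" is pushed out of the top 2, while `without_duplicates` is `["A", "C"]`. That kind of shift could plausibly explain a different top-K accuracy between the two candidate sets, though I haven't verified that against the TFRS implementation.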
Hi,
I'm trying to figure out how the 'candidates' argument works for the FactorizedTopK metric from the retrieval tutorial:
metrics = tfrs.metrics.FactorizedTopK(candidates=movies.batch(128).map(movie_model))
The tutorial uses the 'movies' dataset, and I found that the dataset includes some duplicates.
I tested using an array of unique movies for that argument and got a different accuracy than with the 'movies' dataset.
Can anyone help me understand how the candidates are used to calculate accuracy, and how I should create them from my own dataset (order of items and batch size)?