Skip to content

Recommendation models that use binary rather than floating point operations at prediction time.

License

Notifications You must be signed in to change notification settings

maciejkula/binge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

67 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Binary Representations in Recommendations

This repository contains an implementation of recommendation models that use binary rather than floating point operations at prediction time. This makes them much faster (and less memory intensive), but also less accurate.

The details are in the paper and in the slides.

Implementation

The model is implemented in PyTorch for fitting, and C for prediction (using AVX2 SIMD operations).

Results

Dimension   MRR Binary MRR MRR ratio   PPMS Binary PPMS PPMS ratio Memory use ratio
      32 0.077      0.059     0.768 87,758     264,146      3.010            0.091
      64 0.075      0.066     0.878 45,992     220,833      4.802            0.062
     128 0.076      0.071     0.935 23,870     166,252      6.965            0.047
     256 0.078      0.071     0.908 12,284     190,131     15.477            0.039
     512 0.080      0.073     0.912   6074     122,637     20.190            0.035
    1024 0.080      0.075     0.936   3056      61,301     20.056            0.033

The table above summarises the results of comparing floating point and binary models on the Movielens 1M dataset. MRR (mean reciprocal rank) ratio denotes the fraction of the real-valued MRR the binary model achieves for that dimensionality; PPMS (predictions per millisecond) ratio denotes how many more predictions the binary model can compute per millisecond.

It appears that continous models retain good accuracy as latent dimensionality decreases, binary models' representational power sharply deteriorates. Moving from the 1024 to 32 dimensions in the continuous model implies a 29 times increase in prediction speed at the expense of a modest 4% decrease in accuracy. This compares favourably to switching to binary representations: moving to a 1024-binary dimensional representation implies a sharper accuracy drop at 6% in exchange for a smaller 20 times increase in prediction speed. Moving to 32 dimensions yields a further speed gains at 86 times, but at the cost of a considerable loss of accuracy at 26%.

The results suggest that latent recommender models have good representational power even at low latent dimensionalities. This advantage is lost when using binary embeddings, which need to be high-dimensional to achieve comparable accuracy. At the same time, binary representations' speed advantage is only evident at high dimensions. The attractiveness of binary representations is tightly coupled with high-dimensional models where a single prediction requires many floating point operations, such as convolutional neural networks. Relatively compact latent factor models do not fall into this category.

About

Recommendation models that use binary rather than floating point operations at prediction time.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published