Open source matrix factorization recommender for sparse matrices

jeh0753/sparseMF

sparseMF

SparseMF is a matrix factorization recommender written in Python that runs on top of NumPy and SciPy. It was developed with a focus on speed and on highly sparse matrices. The package is available via pip.

Use SparseMF if you need a recommender that:

  • Runs quickly using explicit recommender data
  • Supports scipy sparse matrix formats
  • Retains the sparsity of your data during training

Algorithm

This repo provides sparse implementations of two matrix factorization algorithms. The algorithms are described by Trevor Hastie et al. in the 2014 paper "Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares", which extends SoftImpute, itself introduced in 2009. Both implementations borrow from FancyImpute's dense Python implementation of the 2009 SoftImpute algorithm. On large, sparse matrices, this version is significantly faster than the dense implementation at predicting ratings for user/item pairs. To learn more about the differences between the two algorithms, read Trevor Hastie's vignette.
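To make the idea behind SoftImpute concrete, here is a minimal dense sketch in NumPy. It is not the code in this package, which works on sparse matrices: the function name `soft_impute_dense`, its parameters, and the stopping rule are assumptions chosen for illustration. Each iteration fills the missing entries with the current estimate, takes an SVD, and shrinks the singular values by `lam`.

```python
import numpy as np


def soft_impute_dense(X, mask, lam=1.0, max_iters=100, tol=1e-5):
    """Illustrative dense SoftImpute (hypothetical helper, not the sparseMF API).

    X    : 2-D array of ratings; values where mask is False are ignored.
    mask : boolean array, True where a rating was observed.
    lam  : amount by which each singular value is shrunk.
    """
    Z = np.zeros_like(X, dtype=float)  # current low-rank estimate
    for _ in range(max_iters):
        # Keep observed ratings, fill the rest with the current estimate.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold the singular values.
        s_shrunk = np.maximum(s - lam, 0.0)
        Z_new = (U * s_shrunk) @ Vt
        # Stop when the estimate barely changes between iterations.
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z


# Tiny example: a rank-1 ratings matrix with two entries treated as missing.
X_demo = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
mask_demo = np.ones_like(X_demo, dtype=bool)
mask_demo[0, 2] = False
mask_demo[3, 0] = False
Z_demo = soft_impute_dense(X_demo, mask_demo, lam=0.1)
```

The dense SVD above is what becomes expensive on large matrices; the sparse implementations in this repo exist to avoid materialising the filled-in matrix.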

Getting Started

SparseMF is simple to use. First, install the package via pip:

pip install sparsemf

Next, choose the algorithm you would like to import, either SoftImpute or SoftImputeALS, and use it as follows:

from sparsemf import SoftImpute

model = SoftImpute()
X = my_data  # your ratings data, e.g. a scipy sparse matrix
model.fit(X)
model.predict([users], [items])  # predicted ratings for user/item pairs
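If your ratings start out as (user, item, rating) triples, one way to build a SciPy sparse matrix to use as `my_data` is sketched below. The sample values and the names `user_idx`, `item_idx`, and `ratings` are made up for illustration, and the choice of CSR is an assumption: this README only says that scipy sparse formats are supported.

```python
import numpy as np
from scipy import sparse

# Hypothetical explicit ratings as parallel arrays of user index, item index, rating.
user_idx = np.array([0, 0, 1, 2, 2])
item_idx = np.array([0, 2, 1, 0, 3])
ratings = np.array([5.0, 3.0, 4.0, 1.0, 2.0])

# Rows are users, columns are items; unobserved pairs are simply not stored.
X = sparse.coo_matrix((ratings, (user_idx, item_idx)), shape=(3, 4)).tocsr()

# Fraction of user/item pairs that actually have a rating.
density = X.nnz / (X.shape[0] * X.shape[1])
print(X.shape, X.nnz, density)
```

Only the five observed ratings are stored, so memory use scales with the number of ratings rather than with users times items.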

Relative Speed

[Figure: speed of SparseMF compared with GraphLab and FancyImpute.]

Other Package Contents

In addition to 'SoftImpute' and 'SoftImputeALS', this GitHub repo includes:

  • Unit tests for SoftImpute.
  • Benchmarking for SoftImputeALS against GraphLab and the FancyImpute SoftImpute implementation.

Resources

Here are some helpful resources:

  1. A Helpful Introduction to Matrix Factorization Recommenders.
  2. Benchmarks for MovieLens Dataset.
  3. Trevor Hastie's Hybrid Implementation of Soft-Impute and ALS.