Patrik Schmidt edited this page Aug 22, 2013 · 3 revisions

Preface

Mahout is a scalable machine learning framework that ships with many useful algorithms and utilities:

  • Collaborative Filtering
  • User and Item based recommenders
  • K-Means, Fuzzy K-Means clustering
  • Mean Shift clustering
  • Dirichlet process clustering
  • Latent Dirichlet Allocation
  • Singular value decomposition
  • Parallel Frequent Pattern mining
  • Complementary Naive Bayes classifier
  • Random forest decision tree based classifier
  • High performance Java collections (previously Colt collections)

Installation

Prerequisites

To get the full power of Mahout, you must run it on Hadoop and its Map/Reduce pipeline.

Download Mahout 0.8 from http://mirror.synyx.de/apache/mahout/

Configuration

Examples

Data set

Use the MovieLens data sets available at http://www.grouplens.org/node/73
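
Depending on which MovieLens set you pick, the ratings file may need a small conversion before Mahout's file-based recommender input can read it. The sketch below assumes the 1M data set, whose ratings.dat is understood to use lines of the form UserID::MovieID::Rating::Timestamp, and rewrites them as userID,itemID,preference lines, the comma-separated layout that Mahout's file data model is generally documented to accept. The class name MovieLensToCsv and its methods are hypothetical helpers written for this page; they are not part of Mahout. If your data set is already tab- or comma-separated, this step is probably unnecessary.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Hypothetical helper (not part of Mahout): converts MovieLens ratings
 * from "UserID::MovieID::Rating::Timestamp" to "userID,itemID,preference".
 * The input layout is an assumption based on the MovieLens 1M data set.
 */
public class MovieLensToCsv {

    /** Converts a single ratings line; the timestamp field is dropped. */
    static String convertLine(String line) {
        String[] fields = line.trim().split("::");
        if (fields.length < 3) {
            throw new IllegalArgumentException("Unexpected ratings line: " + line);
        }
        return fields[0] + "," + fields[1] + "," + fields[2];
    }

    /** Converts a whole file line by line, skipping blank lines. */
    static void convert(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                writer.write(convertLine(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java MovieLensToCsv <ratings.dat> <ratings.csv>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Run it as `java MovieLensToCsv ratings.dat ratings.csv` and point Mahout at the resulting ratings.csv. Dropping the timestamp keeps the file small; keep it as a fourth column if you later want time-aware processing.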
