
Multiverso document


Multiverso

Multiverso provides a parameter server architecture for scaling machine learning tasks on distributed clusters. For each Multiverso job, the computing nodes are logically grouped into two categories: server nodes and worker nodes. A group of server nodes serves model storage and access: the globally shared parameters are distributed across the server nodes, and each server node maintains one model partition. The workload and training data are distributed across a group of worker nodes. Each worker node, on which the client programs run, holds a partition of the training data and trains on that partition using a local model replica. The local model replica is fetched from the parameter server, and the locally produced updates are added back to the parameter server.

Example

A Multiverso example is shown as follows.

#include <multiverso/multiverso.h>
#include <multiverso/util/log.h>
#include <multiverso/util/configure.h>
#include <multiverso/table/array_table.h>
#include <vector>
using namespace multiverso;

int main(int argc, char* argv[]) {
  MV_SetFlag("sync", true);            // run in synchronous mode
  MV_Init(&argc, argv);

  // Create a globally shared array table with 500 elements on the servers.
  ArrayTableOption<int> option;
  option.size = 500;
  ArrayWorker<int>* table = MV_CreateTable(option);

  // Local model replica and local update; their sizes match the table size.
  std::vector<int> model(500, 0);
  std::vector<int> delta(500, 1);

  for (int iter = 0; iter < 100; ++iter) {
    table->Add(delta.data(), delta.size());   // push local updates to the server
    table->Get(model.data(), model.size());   // fetch the latest global model
    // CHECK_EQ(model[i], (iter+1) * MV_NumWorkers());
  }

  MV_ShutDown();
}
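
Because the sync flag is set, each Get is expected to observe the Adds from all workers for the iterations so far, which is why the commented check expects every element of model to equal (iter + 1) * MV_NumWorkers(). To run the example in a distributed setting, the same program is typically launched as multiple processes (for example with an MPI-style launcher, assuming an MPI-based build); MV_NumWorkers() then reflects the number of worker processes.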

Programming with Multiverso involves the following steps.

  1. Set the config and initialize the Multiverso environment.

       MV_SetFlag("sync", true); // optional
       MV_Init(&argc, argv);
    
  2. Create a table, which is the abstraction of the globally shared model.

       ArrayTableOption<int> option;
       option.size = 500;
       ArrayWorker<int>* table = MV_CreateTable(option);
    
  3. Write your main logic, for example your machine learning algorithm, by accessing the global parameters through the Table API (a fuller sketch follows this list).

       // Your learning algorithm to produce delta
       // ...
       // Communicating with PS 
       table->Add(delta.data(), delta.size());
       table->Get(model.data(), model.size());
    
  4. Shut down the Multiverso environment.

       MV_ShutDown();
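
Putting the steps together, below is a minimal sketch of what a worker's training loop might look like when built on the calls shown above (MV_Init, MV_CreateTable, Get, Add, MV_ShutDown). The model size, learning rate, and the gradient computation are placeholders for illustration only, not part of the Multiverso API.

#include <multiverso/multiverso.h>
#include <multiverso/table/array_table.h>
#include <vector>
using namespace multiverso;

int main(int argc, char* argv[]) {
  MV_Init(&argc, argv);

  // Create the globally shared model as an array table on the servers.
  const int kModelSize = 500;                  // placeholder model size
  ArrayTableOption<float> option;
  option.size = kModelSize;
  ArrayWorker<float>* table = MV_CreateTable(option);

  std::vector<float> model(kModelSize, 0.0f);  // local model replica
  std::vector<float> grad(kModelSize, 0.0f);   // locally produced update
  const float lr = 0.1f;                       // placeholder learning rate

  for (int iter = 0; iter < 10; ++iter) {
    // Fetch the latest global model from the parameter server.
    table->Get(model.data(), model.size());

    // Compute an update from this worker's data partition.
    // (Placeholder: a real algorithm would compute gradients here.)
    for (int i = 0; i < kModelSize; ++i) grad[i] = -lr * model[i];

    // Push the local update back to the parameter server.
    table->Add(grad.data(), grad.size());
  }

  MV_ShutDown();
}

In a real job, each worker would read its own partition of the training data and compute its updates from that partition, as described in the overview above.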
    

For detailed information about the API, please refer to the API document.