
Philosophy

Vectorflow is a minimalist neural network library designed with performance in mind for single-machine CPU workflows. If you're looking to quickly train a shallow feed-forward net on a high-dimensional problem and your data fits on a single machine, Vectorflow can probably help you. If you want to run deep CNNs on a GPU, Vectorflow won't be of any help.

Examples

Have a look at examples/ for some tutorials:

  • rcv1.d: logistic regression on sparse data
  • mnist.d: multinomial classification on dense data with a small net

In a nutshell

1) Defining your model

Vectorflow's main abstraction is NeuralNet: a directed acyclic graph (DAG) mapping a set of input vectors to a set of output vectors.

Stacks are among the simplest DAGs. For example, a sparse linear model over a 1,000-dimensional space can be defined as:

import vectorflow;
auto net = NeuralNet()
    .stack(SparseData(1_000))
    .stack(Linear(1));
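
Deeper stacks follow the same pattern. As a sketch (assuming the DenseData input layer and ReLU activation layer shipped with the library, with purely illustrative sizes):

auto mlp = NeuralNet()
    .stack(DenseData(400))   // dense 400-dimensional input
    .stack(Linear(10))       // hidden layer of 10 units
    .stack(ReLU())           // non-linearity between the two Linear layers
    .stack(Linear(1));       // single scalar output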

2) Building a dataset

Efficient IO is key to good performance for shallow architectures on sparse data. Vectorflow doesn't strictly enforce a data layout or format: implementing a data reader is left to the user, so that efficiency can be achieved on arbitrary data sources (row-based or columnar formats, in-memory or out-of-core, read from local disk or over the network, etc.).

Vectorflow uses compile-time introspection to determine how to map the fields of a training-set row to the InputLayer(s) of the graph.

Vectorflow orders lexicographically the row field names prefixed with features and feeds those fields to the InputLayers of the graph, in the order in which these roots were originally added to the graph.

For example, in the simple and common case of a single InputLayer we could define our row type to be:

struct Obs {
    float label;        // target value, used by the loss callback during training
    SparseF[] features; // prefixed with `features`: fed to the net's input layer
    string foo;         // fields without the `features` prefix are not fed to the net
    ushort bar;
}

which matches the architecture described in the previous section, since the features field of Obs will be fed to the SparseData bottom layer of the graph during training.
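
When the graph has several InputLayers, the lexicographic ordering of field names determines the wiring. As a hypothetical sketch, a row type for a two-root graph (a dense root added first, a sparse root added second) could look like:

struct MultiObs {
    float label;
    float[] features_dense;    // lexicographically first: fed to the first root added
    SparseF[] features_sparse; // lexicographically second: fed to the second root added
}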

3) Training

Once the data iterator is ready, we can train the net against a predefined loss function or a custom one. Vectorflow offers an easy way to specify a custom loss through a callback mechanism. For example, we could learn the median value of label as a function of features this way:

import std.math : fabs;

auto loss_grad = delegate float(float[] nn_out, ref Obs o, ref float[] grads)
{
    auto pred = nn_out[0]; // prediction of the net after forward-prop
    if(pred > o.label) // gradient of |pred - label| with respect to pred
        grads[0] = 1.0f;
    else
        grads[0] = -1.0f;

    return fabs(pred - o.label); // return the loss value so it's monitored during training
};
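
The same callback signature supports any loss whose gradient with respect to the net output can be written down. As a sketch, a squared-error loss (which would learn the conditional mean rather than the median) looks like:

// Sketch: squared loss (pred - label)^2; its gradient with respect to
// pred is 2 * (pred - label).
auto l2_grad = delegate float(float[] nn_out, ref Obs o, ref float[] grads)
{
    auto err = nn_out[0] - o.label;
    grads[0] = 2.0f * err;
    return err * err;
};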

Obs[] data = load_dataset();

net.learn(data, loss_grad, AdaGrad(10, 0.1, 500), true, 3);

This trains net with the AdaGrad optimizer (10 passes, a learning rate of 0.1 and a mini-batch size of 500) across 3 cores; the boolean flag enables verbose progress monitoring.
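
learn also accepts predefined losses by name instead of a callback; for instance, rcv1.d trains its logistic regression this way. A sketch, assuming the "logistic" loss name used in that example:

// Sketch: predefined loss specified by name, same optimizer settings.
net.learn(data, "logistic", AdaGrad(10, 0.1, 500), true, 3);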

4) Predictions

Once trained, the net can be queried with net.predict(features_vector), which returns the predicted vector (of size 1 in this case).
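
A minimal sketch of scoring one observation, assuming SparseF pairs a feature index with a value, as when building the dataset:

import std.stdio : writeln;

// Sketch: query the trained net with a sparse feature vector.
SparseF[] fv = [SparseF(12, 1.0f), SparseF(750, 0.5f)];
auto preds = net.predict(fv);
writeln(preds[0]); // single scalar output for this 1-output net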