Matching Label and Prediction Types Between Reductions

Byron Xu edited this page Feb 7, 2023 · 6 revisions

Primer on Learners

VW solves various machine learning problems using a reduction stack. The primary philosophy behind this is that we can reduce new problems and algorithms into problems we have already solved. The VW reduction stack consists of a sequence of learners, ending in a bottom learner that does not further reduce the problem. Examples enter from the top of the stack, and predictions propagate up starting from the bottom.

Learner Functions

Each learner in VW has its own setup, learn, and predict functions. The setup() function is used to set various features within the learner and process command line arguments. The learn() function processes labeled examples to update the current model, and the predict() function outputs a prediction given the current model. More information on learners can be found on the page What is a learner?

Directionality of learn() and predict() Functions

When considering a reduction stack, it is important to understand how labels and predictions are propagated through the stack via the learn() and predict() functions. Unless a learner is at the bottom of the stack (simply called a bottom learner), it relies on the base learner immediately below it. At some point in its own learn() function a reduction learner calls base.learn(), and in its own predict() function it calls base.predict(), in order to invoke these functions in the learner beneath it.

A learner will call base.learn() after it has processed (and potentially altered) a label that was input. Calling base.learn() has the effect of giving this updated label to the base learner. On the other hand, a learner will call base.predict() before it processes (and potentially alters) a prediction, so it will apply its logic to the prediction returned by the base. Thus base.predict() can be thought of as fetching the prediction from the base learner and providing it to the current learner.
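
The ordering described above can be sketched with a toy model. This is not VW code: the classes BottomLearner and OffsetReduction, the offset parameter, and the "remember the last label" model are all hypothetical and exist only to show that the label is altered before base.learn() is called, while the prediction is altered after base.predict() returns.

```python
class BottomLearner:
    """Toy bottom learner (hypothetical): remembers the last label it was given."""

    def __init__(self):
        self.weight = 0.0

    def learn(self, features, label):
        # A bottom learner does not reduce further; it updates its own model.
        self.weight = label

    def predict(self, features):
        return self.weight


class OffsetReduction:
    """Toy reduction (hypothetical): shifts labels going down, un-shifts predictions going up."""

    def __init__(self, base, offset):
        self.base = base
        self.offset = offset

    def learn(self, features, label):
        # Step 1: process (here, alter) the incoming label.
        reduced_label = label - self.offset
        # Step 2: give the updated label to the learner below.
        self.base.learn(features, reduced_label)

    def predict(self, features):
        # Step 1: fetch the prediction from the learner below.
        base_prediction = self.base.predict(features)
        # Step 2: apply this learner's own logic to that prediction.
        return base_prediction + self.offset


bottom = BottomLearner()
stack = OffsetReduction(bottom, offset=10.0)

stack.learn(features=[1.0], label=12.0)
base_saw = bottom.weight                 # the label as seen by the base learner
top_prediction = stack.predict([1.0])    # the prediction as seen above the reduction
```

In this sketch the base learner only ever sees the shifted label, and the caller only ever sees the un-shifted prediction, which mirrors the downward flow of labels and the upward flow of predictions.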

Consider the following image which demonstrates the directionality of labels and predictions between a reduction learner and its base. Note that the base can be either a bottom learner or another reduction learner, but this is not illustrated in the diagram.

Setting Input and Output Prediction and Label Types

In an effort to organize how learners interact with one another, we have introduced the two functions set_input_label_type() and set_output_label_type(). They are member functions of learner builders and are called when creating a learner. The set_input_label_type() function will specify the label type that is directly passed into a learner (which comes from the learner above it), and set_output_label_type() will specify the label type that is passed into base.learn() (which will be sent to the learner below it).

Likewise, we also have set_input_prediction_type() and set_output_prediction_type(). Note that the direction of "input" and "output" is reversed here. The set_input_prediction_type() function will specify the prediction type that is returned by base.predict(), not the prediction type directly passed into the predict() function. This is because predict() acts on the prediction returned by base.predict(). The set_output_prediction_type() function will specify the prediction type that is present at the end of the predict() function, as this is the output prediction of this learner that will be passed back up to any learner above it.
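
As a rough illustration of what the four setters record, the following toy model stores the four types for a single reduction and prints which way each one points. It does not use the VW learner builder API: the LearnerTypes record, the describe() helper, the learner name, and the type strings "MULTICLASS", "SIMPLE", and "SCALAR" are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearnerTypes:
    """Hypothetical record of the four types a learner declares."""
    name: str
    input_label_type: str        # label passed into learn() by the learner above
    output_label_type: str       # label passed into base.learn()
    input_prediction_type: str   # prediction returned by base.predict()
    output_prediction_type: str  # prediction present at the end of predict()


# A made-up reduction that receives multiclass labels, hands simple labels to
# its base, reads scalar predictions back, and emits multiclass predictions.
reduction = LearnerTypes(
    name="toy_multiclass_reduction",
    input_label_type="MULTICLASS",
    output_label_type="SIMPLE",
    input_prediction_type="SCALAR",
    output_prediction_type="MULTICLASS",
)


def describe(types):
    """Summarize the direction of each declared type for one learner."""
    return [
        f"learn(): {types.input_label_type} in, "
        f"{types.output_label_type} to base.learn()",
        f"predict(): {types.input_prediction_type} from base.predict(), "
        f"{types.output_prediction_type} out",
    ]


summary = describe(reduction)
for line in summary:
    print(line)
```

Note how the "input" label type sits on the side facing the learner above, while the "input" prediction type sits on the side facing the base, which is the reversal described in the paragraph above.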

Output Label Type and Input Prediction Type of Bottom Learners

Given that the output label type is specified by what is passed into base.learn() and the input prediction type is specified by what is returned from base.predict(), it is ill-defined what these types should be for a bottom learner, which has no base. So as not to leave these values uninitialized, we have decided to set them to NOLABEL and NOPRED as placeholders.

Enforcing Label and Prediction Coherence Across Learners

The ultimate goal of setting each learner's input and output label and prediction types is to enforce that a reduction stack is set up correctly. Learner builders will perform a check at runtime for correctness.

Given the directionality of labels and predictions, the following 4 properties should always be met for a given reduction learner R:

  1. The learner above R should have the same output label type as the input label type of R
  2. The learner below R should have the same input label type as the output label type of R
  3. The learner above R should have the same input prediction type as the output prediction type of R
  4. The learner below R should have the same output prediction type as the input prediction type of R

And the following 4 properties should always be met for a given bottom learner B:

  1. The learner above B should have the same output label type as the input label type of B
  2. There is no learner below B, and the output label type of B is set to NOLABEL
  3. The learner above B should have the same input prediction type as the output prediction type of B
  4. There is no learner below B, and the input prediction type of B is set to NOPRED
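
The properties listed above amount to comparing each learner with its neighbor and checking the placeholders on the bottom learner. A minimal sketch of such a check follows. It is a toy model and not how VW's learner builders implement it: LearnerTypes, check_stack(), the learner names, and the type strings other than NOLABEL and NOPRED are hypothetical.

```python
from typing import List, NamedTuple


class LearnerTypes(NamedTuple):
    """Hypothetical record of the four types one learner declares."""
    name: str
    input_label_type: str
    output_label_type: str
    input_prediction_type: str
    output_prediction_type: str


def check_stack(stack: List[LearnerTypes]) -> List[str]:
    """Check a stack ordered from top learner to bottom learner.

    Returns a list of mismatch descriptions; an empty list means coherent.
    """
    if not stack:
        return []
    errors = []
    # Compare every learner with the learner directly below it.
    for upper, lower in zip(stack, stack[1:]):
        if upper.output_label_type != lower.input_label_type:
            errors.append(
                f"{upper.name} sends {upper.output_label_type} labels, "
                f"but {lower.name} expects {lower.input_label_type}"
            )
        if upper.input_prediction_type != lower.output_prediction_type:
            errors.append(
                f"{upper.name} expects {upper.input_prediction_type} predictions, "
                f"but {lower.name} returns {lower.output_prediction_type}"
            )
    # The bottom learner has no base, so it must carry the placeholders.
    bottom = stack[-1]
    if bottom.output_label_type != "NOLABEL":
        errors.append(f"{bottom.name} must have output label type NOLABEL")
    if bottom.input_prediction_type != "NOPRED":
        errors.append(f"{bottom.name} must have input prediction type NOPRED")
    return errors


reduction = LearnerTypes("toy_reduction", "MULTICLASS", "SIMPLE", "SCALAR", "MULTICLASS")
bottom = LearnerTypes("toy_bottom", "SIMPLE", "NOLABEL", "NOPRED", "SCALAR")
bad_bottom = LearnerTypes("bad_bottom", "SIMPLE", "NOLABEL", "NOPRED", "MULTICLASS")

good_stack = [reduction, bottom]
bad_stack = [reduction, bad_bottom]

print(check_stack(good_stack))
print(check_stack(bad_stack))
```

Because every adjacent pair is compared once, properties 1 and 2 collapse into a single label comparison per pair, and properties 3 and 4 into a single prediction comparison per pair.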