Incremental Training Best Practice #4692

Open
Zongshun96 opened this issue Apr 23, 2024 · 1 comment
Labels
Documentation (Issue in samples or documentation)

Comments

@Zongshun96
Zongshun96 commented Apr 23, 2024

Description

I believe the ease of incremental training is one highlight of VW. However, best practices for incremental training are not obvious from the documentation. (Please kindly point me to the right page if one already exists.)

I am looking for answers to a couple of questions here. I am using VW 8.6.1.

[Screenshot: F1 scores as new labels are added, with and without data replay]
  1. I am trying to incrementally train my model with new labels and their corresponding new features (most of the new labels and features do not overlap with the previously trained data). However, as I add more labels, the model's F1 scores drop significantly, and I have to retrain on all the data the model has ever seen to recover them. Is retraining on everything the expected way to do incremental training when introducing new labels? In the figure, "no data replay" shows the F1 scores without retraining, and "with data replay" shows them with retraining; the command-line sketch after this list shows the two workflows I am comparing.
  2. I was using the csoaa reduction, and the documentation says I should specify the number of labels before training. However, the incremental training step seems to add classifiers for the newly introduced labels, as shown by the high F1 scores I get. Is this expected behavior or a bug?
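
To make the comparison concrete, here is a minimal sketch of the two workflows, assuming the standard command-line flags (`-i`, `-f`, `--save_resume`); the file names and label count are placeholders:

```sh
# Initial training pass on the first batch (label count is a placeholder).
vw --csoaa 10 -d batch1.dat -f model.vw --save_resume

# "No data replay": continue training from the saved model on the new batch only.
vw -d batch2.dat -i model.vw -f model_incremental.vw --save_resume

# "With data replay": retrain from scratch on everything the model has seen so far.
cat batch1.dat batch2.dat > all.dat
vw --csoaa 10 -d all.dat -f model_retrained.vw
```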

Any feedback is appreciated. Thank you!

@Zongshun96 added the Documentation label on Apr 23, 2024
@JohnLangford
Member

W.r.t. (1), I'm not surprised to see that retraining tends to be helpful. Online learning algorithms are, to some extent, designed to forget the past in the process of adapting to the present.

W.r.t. (2), there are two different notions of csoaa: one where you need to specify the label count up front and one where you specify a different set of features for each of a variable set of labels. Which do you have in mind? (What are the exact flags?)
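
For concreteness, a rough sketch of the two invocations (file names, label counts, and the example data formats in the comments are only illustrative):

```sh
# Fixed label set: --csoaa K needs the number of labels up front.
# Each example lists label:cost pairs followed by one shared feature set, e.g.
#   1:0.0 2:1.0 3:1.0 | shared_feature_a shared_feature_b
vw --csoaa 3 -d train_csoaa.dat -f model_csoaa.vw

# Label-dependent features: --csoaa_ldf allows a variable set of labels per
# example, each label line carrying its own features, with examples separated
# by a blank line, e.g.
#   1:0.0 | features_for_label_1
#   2:1.0 | features_for_label_2
vw --csoaa_ldf multiline -d train_ldf.dat -f model_ldf.vw
```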
