It seems the label model is also fitted on a training set and then evaluated on a test set in the original paper. However, when using weak supervision to generate labeled data, we care more about the quality of the generated labels than about the generalization ability of the label model. For example, suppose a label model produces perfect labels on the training set (which it was fitted on via an unsupervised process) but random labels on a test set (which it was never fitted on). For the purpose of generating labeled data this is a perfect label model, yet it would rank as the worst label model in the benchmark. My question is: for the purpose of generating labeled data (which is then used to train an end model), is it really necessary to do a train/val/test split to evaluate the label model? Can we just fit the unsupervised label model on the whole dataset and then evaluate it on the whole dataset?
I would appreciate any explanation.
That's a great question, and thanks for pointing it out!
In fact, in our original paper we adopted that setup to ease the comparison between the label model and the end model. However, if we only care about the quality of the generated labels (in which case no end model is involved), we can of course use the evaluation setup you mentioned. There are in fact some works that follow that setup, for example, this one.
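For concreteness, here is a minimal sketch of that "fit on everything, evaluate on everything" setup, using Snorkel's LabelModel as a stand-in for a generic label model (any unsupervised label model with fit/score would work the same way). The weak-label matrix L and the gold labels Y_gold below are made-up placeholders, not data from the benchmark:

```python
import numpy as np
from snorkel.labeling.model import LabelModel

# Hypothetical weak-label matrix: 3 labeling functions over 6 examples,
# with -1 denoting an abstain. Replace with your own data.
L = np.array([
    [ 1,  1, -1],
    [ 0,  0,  0],
    [ 1, -1,  1],
    [ 0,  1,  0],
    [-1,  0,  0],
    [ 1,  1,  1],
])
# Gold labels, used only for scoring the generated labels.
Y_gold = np.array([1, 0, 1, 0, 0, 1])

# Fit the (unsupervised) label model on the whole dataset...
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L, n_epochs=500, seed=123)

# ...and evaluate it on the same data: no train/val/test split involved.
metrics = label_model.score(L=L, Y=Y_gold, metrics=["accuracy"])
print(metrics)
```

Since fitting uses only the labeling-function outputs and never touches Y_gold, evaluating on the same examples does not leak label information; it simply measures the quality of the labels the model would hand to the end model.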