Create evaluation & validation sets. #57

tttthomasssss · 2022-01-21T12:13:19Z

Currently, carbon-bot keeps all of its NLU data in one big file and has no dedicated evaluation or validation set. We have annotated 1200 additional NLU examples, collected from Facebook users or internal Rasa testers. With the additional data we can now create dedicated train/dev/eval sets. To keep the sets representative, it might make sense to first merge all existing data and then sample the individual sets from the merged pool, which avoids data distribution shifts caused by data having been collected at different points in time.

Definition of Done:

Get hold of the 1200 new annotated training examples (ask @tttthomasssss).
Merge the existing nlu.yml file with the new data, and create train/dev/eval splits (70/10/20). The resulting training set should be approximately the same size as the current NLU file.
Create a PR with the change.
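
The merge-then-sample step could look roughly like the sketch below. It is only an illustration under a few assumptions: examples are already parsed out of nlu.yml into (text, intent) pairs (the YAML loading is left out), exact duplicates are dropped on merge, and sampling is stratified per intent so that each split keeps a similar intent distribution. The issue itself only asks for sampling from the merged data, so the stratification, the fixed seed, and the helper names `merge_examples` and `stratified_split` are hypothetical choices, not part of carbon-bot or Rasa.

```python
import random
from collections import defaultdict


def merge_examples(existing, new):
    """Concatenate both sources and drop exact duplicates, keeping first-seen order."""
    return list(dict.fromkeys(existing + new))


def stratified_split(examples, fractions=(0.7, 0.1, 0.2), seed=42):
    """Split (text, intent) pairs into train/dev/eval, sampling within each intent."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_intent = defaultdict(list)
    for text, intent in examples:
        by_intent[intent].append((text, intent))

    train, dev, eval_set = [], [], []
    for intent in sorted(by_intent):  # sorted for a deterministic iteration order
        items = by_intent[intent]
        rng.shuffle(items)
        n_train = round(len(items) * fractions[0])
        n_dev = round(len(items) * fractions[1])
        train.extend(items[:n_train])
        dev.extend(items[n_train:n_train + n_dev])
        # Whatever is left goes to eval; very small intents may end up
        # with an empty dev or eval share.
        eval_set.extend(items[n_train + n_dev:])
    return train, dev, eval_set


# Toy placeholder data standing in for the parsed nlu.yml and the new annotations.
existing = [(f"hello {i}", "greet") for i in range(60)] + \
           [(f"bye {i}", "goodbye") for i in range(30)]
new = [(f"hello {i}", "greet") for i in range(50, 100)] + \
      [(f"bye {i}", "goodbye") for i in range(20, 50)]

merged = merge_examples(existing, new)
train, dev, eval_set = stratified_split(merged)
print(len(train), len(dev), len(eval_set))
```

Merging before splitting means old and new examples are mixed in every set, which is the point of the proposal above. Writing the three lists back out as separate NLU files is not shown here.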
