Create evaluation & validation sets. #57

Open
3 tasks
tttthomasssss opened this issue Jan 21, 2022 · 0 comments

Currently, carbon-bot keeps all of its NLU data in one big file and has no dedicated evaluation or validation set. We have annotated 1200 additional NLU examples, collected from Facebook users or internal Rasa testers. With this additional data we can now create dedicated train/dev/eval sets. To keep the sets representative, it might make sense to first merge all existing data and then sample the individual sets from the merged pool. This avoids distribution shifts between the sets caused by the data having been collected at different points in time.

Definition of Done:

  • Get hold of the 1200 new annotated training examples (ask @tttthomasssss).
  • Merge the existing nlu.yml file with the new data and create train/dev/eval splits (70/10/20). The resulting training set will be approximately the same size as the current NLU file.
  • Create a PR with the change.
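
The merge-then-sample step could look roughly like the sketch below. It is only an illustration under a few assumptions: examples are represented as plain `(text, intent)` pairs rather than parsed from nlu.yml, the split is stratified per intent (the issue only asks for 70/10/20 overall), and `split_examples`, `existing`, and `new` are hypothetical names with made-up stand-in data.

```python
import random
from collections import defaultdict


def split_examples(examples, ratios=(0.7, 0.1, 0.2), seed=42):
    """Split (text, intent) pairs into train/dev/eval, stratified by intent."""
    by_intent = defaultdict(list)
    for text, intent in examples:
        by_intent[intent].append((text, intent))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, dev, eval_ = [], [], []
    for intent in sorted(by_intent):
        items = by_intent[intent]
        rng.shuffle(items)
        n_train = round(len(items) * ratios[0])
        n_dev = round(len(items) * ratios[1])
        train.extend(items[:n_train])
        dev.extend(items[n_train:n_train + n_dev])
        # Whatever is left goes to eval, so no example is dropped.
        eval_.extend(items[n_train + n_dev:])
    return train, dev, eval_


# Hypothetical stand-ins for the existing nlu.yml examples and the new annotations.
existing = [(f"old greet {i}", "greet") for i in range(60)]
existing += [(f"old bye {i}", "goodbye") for i in range(40)]
new = [(f"new greet {i}", "greet") for i in range(20)]
new += [(f"new bye {i}", "goodbye") for i in range(30)]

# Merge first, dropping exact duplicates while keeping order, then sample.
merged = list(dict.fromkeys(existing + new))
train, dev, eval_ = split_examples(merged)
print(len(train), len(dev), len(eval_))
```

Because old and new examples are shuffled together before sampling, each set gets a mix of both collection periods. For intents with very few examples the per-intent rounding can leave the dev set empty, so small intents may need manual handling.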