Trains forever without stopping #881

Open
cuuupid opened this issue Jun 12, 2020 · 2 comments

cuuupid commented Jun 12, 2020

After assembling a small dataset (40 utterances, 4 entities) and trying to train, the script finishes fitting the intent parser relatively quickly but then gets stuck on slot filling. After 4 hours it still seems to be running (and using 17% of CPU).

snips-nlu generate-dataset en test.yml > test.json
snips-nlu train test.json testmodel
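
To make the hang reproducible without waiting for hours, the training command can be wrapped in a time limit. The following is a minimal sketch using only the standard library; the helper name `run_with_timeout` and the time limit are illustrative assumptions, not part of snips-nlu.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run `cmd` and return (finished, returncode).

    `finished` is False when the command exceeded `timeout_s` seconds;
    subprocess.run kills the child process in that case.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, None
    return True, proc.returncode
```

Calling `run_with_timeout(["snips-nlu", "train", "test.json", "testmodel"], 600)` would then report a timeout after ten minutes instead of blocking indefinitely, which gives a consistent pass/fail signal to compare between Windows, WSL, and other environments.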

It also does not seem to be using the GPU although I'm not sure that would change much here. FWIW it also gets stuck when using it as a module within a Python script (tested by attempting the quickstart tutorial in the docs).
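
For the Python-module case, one way to see where training is stuck is to have the interpreter print every thread's stack trace if the call is still running after a delay. This is a sketch built on the standard-library `faulthandler` module; the wrapper name `call_with_stack_dump` and its parameters are hypothetical.

```python
import faulthandler
import sys


def call_with_stack_dump(fn, *args, dump_after_s=60, file=None, **kwargs):
    """Call fn(*args, **kwargs) with a watchdog armed.

    If the call is still running after `dump_after_s` seconds, the
    tracebacks of all threads are written to `file` (default: stderr),
    and again every `dump_after_s` seconds until the call returns.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(dump_after_s, repeat=True, file=target)
    try:
        return fn(*args, **kwargs)
    finally:
        # Disarm the watchdog whether fn returned or raised.
        faulthandler.cancel_dump_traceback_later()
```

Assuming the hanging call is the quickstart's `engine.fit(dataset)`, wrapping it as `call_with_stack_dump(engine.fit, dataset, dump_after_s=300)` should show which function the slot-filling step is sitting in when it stalls.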

Environment:

  • OS: Windows 10 (CUDA 9.0, Geforce 1060x), also occurs in WSL (no GPU)
  • python version: 3.6.6 (Anaconda 64-bit)
  • snips-nlu version: 0.20.2
cuuupid added the bug label Jun 12, 2020
@adrienballsonos

Hey @pshah123,
Sorry for the late reply.

  1. Have you been able to reproduce this consistently?
  2. What does the "scheduling" intent look like: does it contain builtin entities?

It would be ideal if you could share the data for this intent, or provide a subset of the data that triggers the issue.

Thanks!

cuuupid commented Jun 18, 2020

Have you been able to reproduce this consistently?

Yes, but only under the conditions above. For now I've been able to get the model to work inside a Docker container running Ubuntu 18 on the same Windows 10 host. The model works superbly in there (which is pretty amazing considering how small the dataset is, so kudos to you!). Unfortunately, running the train script directly on Windows 10 just hangs forever.

What does the "scheduling" intent look like: does it contain builtin entities?

I can't share the full data but I can provide a subset:

type: intent
name: scheduling
utterances:
  - "Do [person:person](you) have a few minutes [time:time](later today or tomorrow) for a brief [place:place](follow up call)"
---
type: entity
name: person
automatically_extensible: yes
values:
  - "John"
---
type: entity
name: time
automatically_extensible: yes
values:
  - "tomorrow"
---
type: entity
name: place
automatically_extensible: yes
values:
  - "Montreal"
---
type: entity
name: subject
automatically_extensible: yes
values:
  - "Birthday Party"

There are other utterances and values that follow the same format (double-quoted ASCII strings in YAML).
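
The subset above can be sanity-checked before training by listing the slot annotations in each utterance and comparing them with the declared entities. The sketch below assumes annotations always take the `[slot:entity](text)` form shown in the subset and that builtin entities are the ones prefixed with `snips/`; the function names are hypothetical and this is not the parser snips-nlu itself uses.

```python
import re

# Matches annotations of the form [slot_name:entity_name](annotated text).
SLOT_PATTERN = re.compile(r"\[([^:\]]+):([^\]]+)\]\(([^)]*)\)")


def extract_slots(utterance):
    """Return a list of (slot, entity, text) tuples found in an utterance."""
    return [match.groups() for match in SLOT_PATTERN.finditer(utterance)]


def undeclared_entities(utterances, declared):
    """Return custom entities used in utterances but missing from `declared`.

    Entities starting with "snips/" are treated as builtin (an assumption)
    and are not expected to be declared in the dataset.
    """
    used = {
        entity
        for utterance in utterances
        for _, entity, _ in extract_slots(utterance)
        if not entity.startswith("snips/")
    }
    return sorted(used - set(declared))
```

For the utterance in the subset, `extract_slots` yields three annotations (`person`, `time`, `place`), none of them builtin, and the `subject` entity is declared but not used by that utterance.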

Thanks for the help and kudos again on making such a powerful classifier :)
