Distributed Labeling Function #93

Open
jeff-hernandez opened this issue Jan 3, 2020 · 0 comments
Labels
enhancements: Improvement to an existing feature

Comments

Collaborator

jeff-hernandez commented Jan 3, 2020

  • Do we want to compute the prediction engineering process in parallel?
  • Does the Dask API allow for a data slice generator?

Process

Initial thoughts on what the process might look like:
- A Dask data slice would be the input to the labeling function.
- The labeling function would extend the task graph of the Dask data slice without computing it.
- The label times would either be computed iteratively based on a search criterion or be persisted on the cluster.
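
The steps above could be sketched roughly as follows. This is only an illustration using `dask.delayed`, not a proposed implementation: the toy `transactions` data, the `make_slices` generator, and the `total_spent` labeling function are all hypothetical names made up for the example, and the "label" here is a simple total rather than a real label time.

```python
from dask import compute, delayed

# Hypothetical toy data: transaction amounts already grouped by customer.
transactions = {
    "a": [10, 25, 5],
    "b": [40],
    "c": [3, 3, 3, 3],
}


def make_slices(data):
    """Yield one lazy data slice per entity (stand-in for a data slice generator)."""
    for key, rows in data.items():
        # delayed(list)(rows) builds a task; the slice is not materialized yet.
        yield key, delayed(list)(rows)


def total_spent(data_slice):
    """Hypothetical labeling function: label a slice with its total amount."""
    return sum(data_slice)


# Extend each slice's task graph with the labeling function; nothing runs here.
lazy_labels = {key: delayed(total_spent)(ds) for key, ds in make_slices(transactions)}

# Trigger the computation for all slices at once; Dask schedules the tasks.
(labels,) = compute(lazy_labels)
print(labels)
```

On a cluster, `dask.persist` could be used in place of `compute` to keep the results distributed instead of gathering them locally. Whether that fits the search-based iteration described above is still an open question.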
