Make dataset loaders on-the-fly #58

Open
2 of 3 tasks
cthoyt opened this issue Jan 21, 2022 · 0 comments

cthoyt commented Jan 21, 2022

I think it would be better to have the dataset download and processing happen client-side, then use pystow to store the results in a reliable place. This would also allow the TWOSIDES and DrugBank datasets, which require random negative sampling, to be used with multiple random seeds, e.g., to investigate the robustness of results. Further, it would allow for a more idiomatic dataset loader that's extensible to new datasets.
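To make the seed idea concrete, here is a rough sketch of what seed-aware negative sampling with a local cache could look like. This is not the existing loader: `sample_negatives`, `load_labeled_pairs`, the file naming, and the one-negative-per-positive ratio are all hypothetical. The cache directory is passed in explicitly so the sketch only needs the standard library; in practice it could come from pystow.

```python
import json
import random
from pathlib import Path


def sample_negatives(positives, seed):
    """Sample one negative drug pair per positive pair, reproducibly for a seed.

    Assumption: a pair is negative if it does not appear (in either order)
    among the positives. Negatives are drawn with replacement.
    """
    rng = random.Random(seed)
    positive_set = {frozenset(pair) for pair in positives}
    drugs = sorted({drug for pair in positives for drug in pair})
    total_pairs = len(drugs) * (len(drugs) - 1) // 2
    if len(positive_set) >= total_pairs:
        raise ValueError("every possible pair is positive; nothing to sample")
    negatives = []
    while len(negatives) < len(positives):
        left, right = rng.sample(drugs, 2)
        if frozenset((left, right)) not in positive_set:
            negatives.append((left, right))
    return negatives


def load_labeled_pairs(positives, seed, cache_dir):
    """Return (drug, drug, label) triples, caching negatives per seed.

    Hypothetical layout: one JSON file per seed inside cache_dir, which
    could be a directory such as pystow.join("chemicalx", "twosides").
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"negatives_seed{seed}.json"
    if path.exists():
        # Reuse the negatives sampled earlier for this seed.
        negatives = [tuple(pair) for pair in json.loads(path.read_text())]
    else:
        negatives = sample_negatives(positives, seed)
        path.write_text(json.dumps(negatives))
    labeled = [(*pair, 1) for pair in positives]
    labeled += [(*pair, 0) for pair in negatives]
    return labeled
```

Calling `load_labeled_pairs` with several different seeds would give several versions of the same dataset, which is what a robustness comparison needs, while repeated calls with the same seed reuse the cached file.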

Depends on:

@cthoyt cthoyt self-assigned this Jan 21, 2022
@cthoyt cthoyt added the dataset label Feb 1, 2022