Feature request: ImageNet data loader #100

Open
adrhill opened this issue Mar 22, 2022 · 3 comments · May be fixed by #146

Comments

adrhill commented Mar 22, 2022

ImageNet is quite large and locked behind terms of access that require an account.

However, it would be nice to be able to either

  • set a config (or ENV) variable to download ImageNet through MLDatasets
  • point MLDatasets to a local copy of ImageNet

and be able to use MLDatasets' interface of

train_x, train_y = ImageNet.traindata()
test_x,  test_y  = ImageNet.testdata()

as well as ImageNet.convert2image(x).
Ideally data would be in WHCN format for Flux and Metalhead models.
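
To make the request more concrete, here is a minimal sketch of how the "local copy" option and the WHCN layout could look. Everything in it is an assumption for illustration and not part of MLDatasets: the helper names `resolve_imagenet_dir` and `to_whcn`, the environment variable `IMAGENET_DIR`, and the premise that decoded images arrive as height × width × channels × batch arrays.

```julia
# Hypothetical helper (not an MLDatasets function): use an explicit directory
# if one is given, otherwise fall back to an assumed environment variable.
function resolve_imagenet_dir(dir::Union{Nothing,AbstractString}=nothing;
                              env=ENV, var="IMAGENET_DIR")
    path = dir !== nothing ? dir : get(env, var, nothing)
    path === nothing &&
        error("No ImageNet directory given; pass `dir` or set $(var).")
    isdir(path) || error("ImageNet directory not found: $(path)")
    return String(path)
end

# Hypothetical helper: swap the first two dimensions so that a batch stored as
# height × width × channels × N becomes width × height × channels × N (WHCN).
to_whcn(x::AbstractArray{<:Any,4}) = permutedims(x, (2, 1, 3, 4))
```

A loader built on this could call `resolve_imagenet_dir()` once at construction time and apply `to_whcn` to each batch before handing it to a model.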

CarloLucibello (Member) commented

For reference, here is an example of ImageNet usage:
https://github.com/avik-pal/Lux.jl/tree/main/examples/ImageNet

lorenzoh (Contributor) commented May 4, 2022

For reference, a ManualDataDep (from DataDeps.jl) may be useful when a dataset requires the user to perform some manual steps.
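
As a rough sketch of that suggestion, a manual dependency could be registered with DataDeps.jl as shown below. The dependency name `"ImageNet"` and the wording of the message are assumptions made for illustration; the actual text would need to match the ImageNet terms of access and whatever folder layout the loader ends up expecting.

```julia
using DataDeps

# Registering only records the instructions; nothing is downloaded here.
# The name "ImageNet" and the message text are placeholders.
register(ManualDataDep(
    "ImageNet",
    """
    ImageNet cannot be downloaded automatically because access requires an
    account and acceptance of the terms of access.
    Please download the dataset manually and place it in a folder named
    "ImageNet" inside one of the directories on the DataDeps load path.
    """,
))

# Resolving the dependency returns the local folder once it exists; if it is
# missing, DataDeps shows the message above instead of downloading anything.
# imagenet_path = datadep"ImageNet"
```

The resolution line is left commented out because it prompts the user when the folder is missing.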

adrhill (Author) commented Jun 21, 2022

Thanks for the pointers, I will open a draft PR for this!

Since there is more than one version of ImageNet, I propose to mirror PyTorch and have MLDatasets.ImageNet refer to the ImageNet 2012 Classification Dataset (ILSVRC 2012-2017). The ImageNet authors themselves call it "the most highly-used subset of ImageNet".

adrhill linked a pull request on Jun 23, 2022 that will close this issue