ImageNet subset generator

Generate subsets of the original ImageNet1K dataset, including commonly used ones such as in100_sololearn and in1k_10percent_simclrv2.

Usage

  • git clone https://github.com/BenediktAlkin/ImageNetSubsetGenerator
  • cd ImageNetSubsetGenerator

Generate subset

  • python main_subset.py --in1k_path <ImageNet1K_path> --out_path <out_path> --version in100_sololearn
  • This copies the corresponding samples from <ImageNet1K_path> to <out_path>.
  • The result can then be used directly with, e.g., torchvision's ImageFolder: subset = ImageFolder(root=<out_path>)

For example: python main_subset.py --in1k_path /data/imagenet1k --out_path /data/imagenet1k_10percent_simclrv2 --version in1k_10percent_simclrv2

All supported versions are listed in the repository and via python main_subset.py --help.
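
Once a subset has been generated, loading it might look like the minimal sketch below. The helper name load_subset is hypothetical and not part of this repository. Following the bullet above, root is taken to be <out_path>; if the output turns out to contain split subfolders, point root at the split folder instead. Only ImageFolder and its classes attribute come from torchvision.

```python
def load_subset(root):
    """Load a generated subset folder as a torchvision ImageFolder dataset.

    Hypothetical helper for illustration; `root` is assumed to contain one
    subfolder per class (e.g. n01440764), as ImageFolder expects.
    """
    # Imported inside the function so torchvision is only required
    # when a dataset is actually loaded.
    from torchvision.datasets import ImageFolder

    return ImageFolder(root=root)


# Example usage (the path is the out_path from the example command above):
#   subset = load_subset("/data/imagenet1k_10percent_simclrv2")
#   print(len(subset), "samples in", len(subset.classes), "classes")
```

ImageFolder derives the class labels from the subfolder names, so the WordNet IDs of the selected classes become the dataset's class names.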

Check classes/samples of dataset

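python main_statistics.py <path>

This prints the number of classes, the number of samples, and the class names of the train and valid splits, for example: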

train n_classes: 1000
valid n_classes: 1000
train n_samples: 1282169
valid n_samples: 50000
train classes: ['n01440764', ...]
valid classes: ['n01440764', ...]
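
For readers who want to double-check a generated subset by hand, similar numbers can be computed with the standard library alone. This is a rough sketch, not the implementation of main_statistics.py: the function name split_statistics is hypothetical, and it assumes a split folder that contains one subfolder per class with the image files directly inside.

```python
from pathlib import Path


def split_statistics(split_dir):
    """Return (sorted class names, number of samples) for one split folder.

    Assumption: `split_dir` contains one subfolder per class, and every
    regular file inside a class folder counts as one sample.
    """
    class_dirs = sorted(p for p in Path(split_dir).iterdir() if p.is_dir())
    classes = [p.name for p in class_dirs]
    n_samples = sum(
        1 for class_dir in class_dirs for f in class_dir.iterdir() if f.is_file()
    )
    return classes, n_samples


# Example usage (hypothetical path):
#   classes, n_samples = split_statistics("/data/imagenet1k/train")
#   print("train n_classes:", len(classes))
#   print("train n_samples:", n_samples)
```

Comparing these counts between the original dataset and a generated subset is a quick way to confirm that the expected number of classes and samples was copied.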