KimRass/CLIP

'CLIP' (Radford et al., 2021) implementation from scratch in PyTorch

Pretrained Model

Linear Classification on ImageNet1k (mini) Dataset

# e.g., (--n_cpus is optional)
python3 linear_classification.py \
    --ckpt_path="../clip_flickr.pth" \
    --data_dir="../imagenet-mini/" \
    --n_epochs=64 \
    --batch_size=128 \
    --n_cpus=4
  • Top-5 accuracy on validation set: 5.8%

Zero-shot Classification on ImageNet1k (mini) Dataset

# e.g., (--n_cpus, --max_len, and --k are optional)
python3 zero_shot_classification.py \
    --ckpt_path="../clip_flickr.pth" \
    --data_dir="../imagenet-mini/" \
    --batch_size=16 \
    --n_cpus=4 \
    --max_len=128 \
    --k=10
  • Top-10 accuracy on train + validation set: 3.0%
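For reference, top-k accuracy in a zero-shot setting is usually computed by ranking class text embeddings by cosine similarity to each image embedding and checking whether the true class appears among the k best. The sketch below illustrates that idea only; `top_k_accuracy` and its arguments are hypothetical names and are not taken from `zero_shot_classification.py`.

```python
import torch
import torch.nn.functional as F


def top_k_accuracy(img_emb, class_txt_emb, labels, k=10):
    """Fraction of images whose true class is among the k most similar classes.

    img_emb:       (N, D) image embeddings (hypothetical input)
    class_txt_emb: (C, D) one text embedding per class (hypothetical input)
    labels:        (N,)   ground-truth class indices
    """
    # Cosine similarity = dot product of L2-normalized embeddings.
    img_emb = F.normalize(img_emb, dim=-1)
    class_txt_emb = F.normalize(class_txt_emb, dim=-1)
    sim = img_emb @ class_txt_emb.t()  # (N, C)

    # Indices of the k highest-scoring classes for each image.
    topk_idx = sim.topk(k, dim=-1).indices  # (N, k)

    # An image counts as a hit if its label appears anywhere in its top k.
    hits = (topk_idx == labels.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()
```

With `k=10`, as in the command above, an image counts as correct whenever its true class is ranked in the first ten.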

Implementation Details

  • The temperature-related part of the paper is not implemented:
    • "The learnable temperature parameter was clipped to prevent scaling the logits by more than 100 which we found necessary to prevent training instability."
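The omitted temperature mechanism could be added roughly as follows. This is a minimal sketch of what the paper describes, not code from this repository: the class name `ClippedTemperatureLoss` and its arguments are hypothetical, and the initial temperature of 0.07 is the value reported in the paper. The scale is stored in log space and clamped so that the logits are never multiplied by more than 100.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class ClippedTemperatureLoss(nn.Module):
    """Symmetric contrastive loss with a learnable, clipped logit scale (hypothetical name)."""

    def __init__(self, init_temperature=0.07, max_scale=100.0):
        super().__init__()
        # Learn log(1 / temperature) so the scale stays positive after exp().
        self.log_scale = nn.Parameter(torch.tensor(math.log(1.0 / init_temperature)))
        self.max_log_scale = math.log(max_scale)

    def scale(self):
        # Clamp in log space: the effective multiplier never exceeds max_scale.
        return self.log_scale.clamp(max=self.max_log_scale).exp()

    def forward(self, img_emb, txt_emb):
        # Row i of img_emb is assumed to be paired with row i of txt_emb.
        img_emb = F.normalize(img_emb, dim=-1)
        txt_emb = F.normalize(txt_emb, dim=-1)
        logits = self.scale() * (img_emb @ txt_emb.t())  # (B, B)
        targets = torch.arange(logits.size(0), device=logits.device)
        # Average the image-to-text and text-to-image cross-entropy terms.
        loss_i2t = F.cross_entropy(logits, targets)
        loss_t2i = F.cross_entropy(logits.t(), targets)
        return (loss_i2t + loss_t2i) / 2
```

Because `log_scale` is an `nn.Parameter`, it would be updated by the optimizer together with the encoder weights, provided the loss module's parameters are passed to the optimizer.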

About

PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k
