Optimization as a Model for Few-shot Learning

PyTorch implementation of "Optimization as a Model for Few-shot Learning" (ICLR 2017, Oral).

Prerequisites

python 3+
pytorch 0.4+ (developed on 1.0.1 with cuda 9.0)
pillow
tqdm (a nice progress bar)

Data

Mini-Imagenet as described here
- You can download it from here (~2.7GB, google drive link)

Preparation

Make sure Mini-Imagenet is split properly. For example:

- data/
  - miniImagenet/
    - train/
      - n01532829/
        - n0153282900000005.jpg
        - ...
      - n01558993/
      - ...
    - val/
      - n01855672/
      - ...
    - test/
      - ...
- main.py
- ...

The layout is already correct if you download and extract Mini-Imagenet from the link above.

Check scripts/train_5s_5c.sh and make sure --data-root is set to your data directory.
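The directory layout described above can be sanity-checked before training. The following is a minimal sketch, not part of this repo: the helper name summarize_splits is hypothetical, and it assumes images use the .jpg extension as in the example tree.

```python
from pathlib import Path


def summarize_splits(data_root):
    """Return {split: (num_class_dirs, num_jpg_images)} for train/val/test.

    Hypothetical helper for illustration; raises if a split folder is missing.
    """
    root = Path(data_root)
    summary = {}
    for split in ("train", "val", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Each class is one sub-directory (e.g. n01532829) holding its images.
        class_dirs = [d for d in sorted(split_dir.iterdir()) if d.is_dir()]
        n_images = sum(
            1
            for d in class_dirs
            for f in d.iterdir()
            if f.suffix.lower() == ".jpg"
        )
        summary[split] = (len(class_dirs), n_images)
    return summary
```

For example, summarize_splits("data/miniImagenet") returns the class and image counts per split. The commonly used Mini-Imagenet split has 64/16/20 classes for train/val/test, so very different counts suggest the archive was not extracted as expected.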

Run

For 5-shot, 5-class training, run

bash scripts/train_5s_5c.sh

Hyper-parameters follow the author's repo.

For 5-shot, 5-class evaluation, run (remember to change --resume and --seed arguments)

bash scripts/eval_5s_5c.sh

Notes

Results (This repo is developed following the pytorch reproducibility guideline):

seed	train episodes	val episodes	val acc mean	val acc std	test episodes	test acc mean	test acc std
719	41000	100	59.08	9.9	100	56.59	8.4
-	-	-	-	-	250	57.85	8.6
-	-	-	-	-	600	57.76	8.6
53	44000	100	58.04	9.1	100	57.85	7.7
-	-	-	-	-	250	57.83	8.3
-	-	-	-	-	600	58.14	8.5
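To compare these numbers with accuracies reported as a mean with a 95% confidence interval, the std columns can be converted to an interval half-width. This is a rough sketch under two assumptions that the table itself does not state: that each std is the standard deviation of per-episode accuracy, and that a normal approximation (factor 1.96) is acceptable. The function name ci95_half_width is made up for illustration.

```python
import math


def ci95_half_width(std, n_episodes):
    """Half-width of a normal-approximation 95% CI for the mean accuracy.

    Assumes `std` is the per-episode standard deviation (an assumption).
    """
    return 1.96 * std / math.sqrt(n_episodes)


# (seed, test episodes, test acc mean, test acc std) taken from the table above.
rows = [(719, 600, 57.76, 8.6), (53, 600, 58.14, 8.5)]
for seed, n, mean, std in rows:
    print(f"seed {seed}: {mean:.2f} +/- {ci95_half_width(std, n):.2f} ({n} test episodes)")
```

Because the half-width shrinks with the square root of the episode count, the 600-episode rows give a much tighter estimate than the 100-episode rows.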

The results I get from directly running the author's repo can be found here. This implementation performs slightly better (~5%), but neither result matches the number in the paper (60%). Discussion and help are welcome!
Training with the default settings takes ~2.5 hours on a single Titan Xp while occupying ~2GB GPU memory.
The implementation replicates two learners, similar to the author's repo:
- learner_w_grad functions as a regular model; its gradients and loss are fed as inputs to the meta learner.
- learner_wo_grad constructs the graph for the meta learner:
  - All the parameters in learner_wo_grad are replaced by cI, the output of the meta learner.
  - nn.Parameters in this model are cast to torch.Tensor to connect the graph to the meta learner.
There are several ways to copy parameters from the meta learner to the learner, depending on the scenario:
- copy_flat_params: we only need the parameter values and keep the original grad_fn.
- transfer_params: we want the values as well as the grad_fn (from cI to learner_wo_grad).
  - .data.copy_ vs. clone(): the latter retains all the properties of a tensor, including grad_fn.
  - To maintain the batch statistics, load_state_dict is used (from learner_w_grad to learner_wo_grad).
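The difference between the two copy styles above can be seen on a toy example. This is a minimal sketch, not the code in utils.py: the tensor meta_param and the single nn.Linear layer are stand-ins invented for illustration, and cI here is just a tensor with a grad_fn playing the role of the meta learner's output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the meta learner: cI is a non-leaf tensor with a grad_fn.
meta_param = torch.ones(6, requires_grad=True)
cI = meta_param * 2.0

learner = nn.Linear(3, 2, bias=False)  # 2 x 3 = 6 weights

# (a) Value-only copy: the Parameter receives the numbers but stays a leaf,
#     so nothing links it back to meta_param.
learner.weight.data.copy_(cI.view_as(learner.weight))
values_only = learner.weight
print(values_only.grad_fn)  # None

# (b) Graph-preserving copy: clone() keeps the autograd history of cI.
connected = cI.view_as(learner.weight).clone()
print(connected.grad_fn is not None)  # True

# Using the connected tensor as the weight lets the loss backpropagate
# into the stand-in meta learner parameter.
x = torch.ones(1, 3)
loss = F.linear(x, connected).sum()
loss.backward()
print(meta_param.grad)
```

Style (a) is sufficient when the learner only needs the current values, while style (b) is required whenever the loss computed by the learner must produce gradients for the meta learner.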

References

CloserLookFewShot (Data loader)
pytorch-meta-optimizer (Casting nn.Parameters to torch.Tensor inspired from here)
meta-learning-lstm (Author's repo in Lua Torch)
