Continual Learning with Vision Transformers

Update: Our paper won the best runner-up award at the 3rd CLVision workshop.

This repo hosts the official implementation of our CVPR 2022 workshop paper "Towards Exemplar-Free Continual Learning in Vision Transformers: An Account of Attention, Functional and Weight Regularization".

TL;DR: We introduce attentional and functional variants of the asymmetric and symmetric Pooled Attention Distillation (PAD) losses for Vision Transformers.

Running the code

Below are two example commands for the asymmetric attentional and functional variants, pooling along the height dimension on ImageNet-100.

Attentional variant:

>>> python3 -u src/main_incremental.py --datasets imagenet_32_reduced --network Early_conv_vit --approach olwf_asym --nepochs $NEPOCHS --log disk --batch-size 1024 --gpu $GPU --exp-name dummy_attentional_exp --lr 0.01 --seed ${seed} --lamb 1.0 --num-tasks $NUM_TASKS --nc-first-task $NC_FIRST_TASK --lr-patience 20 --plast_mu 1.0 --pool-along 'height'

Functional variant:

>>> python3 -u src/main_incremental.py --datasets imagenet_32_reduced --network Early_conv_vit --approach olwf_asympost --nepochs $NEPOCHS --log disk --batch-size 1024 --gpu $GPU --exp-name dummy_functional_exp --lr 0.01 --seed ${seed} --lamb 1.0 --num-tasks $NUM_TASKS --nc-first-task $NC_FIRST_TASK --lr-patience 20 --plast_mu 1.0 --pool-along 'height'

The corresponding runs for symmetric variants would then be:

Attentional variant:

>>> python3 -u src/main_incremental.py --datasets imagenet_32_reduced --network Early_conv_vit --approach olwf_asym --nepochs $NEPOCHS --log disk --batch-size 1024 --gpu $GPU --exp-name dummy_attentional_exp --lr 0.01 --seed ${seed} --lamb 1.0 --num-tasks $NUM_TASKS --nc-first-task $NC_FIRST_TASK --lr-patience 20 --plast_mu 1.0 --pool-along 'height' --sym

Functional variant:

>>> python3 -u src/main_incremental.py --datasets imagenet_32_reduced --network Early_conv_vit --approach olwf_asympost --nepochs $NEPOCHS --log disk --batch-size 1024 --gpu $GPU --exp-name dummy_functional_exp --lr 0.01 --seed ${seed} --lamb 1.0 --num-tasks $NUM_TASKS --nc-first-task $NC_FIRST_TASK --lr-patience 20 --plast_mu 1.0 --pool-along 'height' --sym
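
The four commands above differ only in the --approach value and the optional --sym flag, and they all rely on shell variables ($NEPOCHS, $GPU, ${seed}, $NUM_TASKS, $NC_FIRST_TASK) that you have to set yourself. As a minimal sketch, the helper below assembles the same argument lists in Python. The function name build_command and the numeric values in the usage section are hypothetical placeholders for illustration, not the settings used in the paper; only the flags shown in the commands above are used.

```python
import shlex

# Approach names copied from the commands above.
APPROACH = {"attentional": "olwf_asym", "functional": "olwf_asympost"}


def build_command(variant, symmetric=False, *, nepochs, gpu, seed,
                  num_tasks, nc_first_task, pool_along="height"):
    """Return the argument list for one run of src/main_incremental.py.

    Hypothetical helper: it only mirrors the flags shown in the README commands.
    """
    if variant not in APPROACH:
        raise ValueError(f"unknown variant: {variant!r}")
    cmd = [
        "python3", "-u", "src/main_incremental.py",
        "--datasets", "imagenet_32_reduced",
        "--network", "Early_conv_vit",
        "--approach", APPROACH[variant],
        "--nepochs", str(nepochs),
        "--log", "disk",
        "--batch-size", "1024",
        "--gpu", str(gpu),
        "--exp-name", f"dummy_{variant}_exp",
        "--lr", "0.01",
        "--seed", str(seed),
        "--lamb", "1.0",
        "--num-tasks", str(num_tasks),
        "--nc-first-task", str(nc_first_task),
        "--lr-patience", "20",
        "--plast_mu", "1.0",
        "--pool-along", pool_along,
    ]
    if symmetric:
        # The symmetric variants reuse the same approach name and add --sym.
        cmd.append("--sym")
    return cmd


if __name__ == "__main__":
    # Placeholder values (assumptions, not the paper's settings).
    settings = dict(nepochs=100, gpu=0, seed=0, num_tasks=10, nc_first_task=10)
    for variant in APPROACH:
        for symmetric in (False, True):
            print(shlex.join(build_command(variant, symmetric, **settings)))
```

Running the script from the repository root prints the four command lines, which can be pasted into a shell or passed to subprocess.run as a list. Only 'height' is shown for --pool-along above, so other values are left for you to check against the argument parser in src/.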

Other available continual learning approaches with Vision Transformers include:

EWC • Finetuning • LwF • PathInt

The detailed scripts for our experiments can be found in scripts/.

Cite

If you find our implementation useful, please cite:

@InProceedings{Pelosin_Jha_CVPR,
    author    = {Pelosin, Francesco and Jha, Saurav and Torsello, Andrea and Raducanu, Bogdan and van de Weijer, Joost},
    title     = {Towards Exemplar-Free Continual Learning in Vision Transformers: An Account of Attention, Functional and Weight Regularization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {3820-3829}
}

Acknowledgement

This repo is based on FACIL.
