Heterogeneous Continual Learning

Official PyTorch implementation of the CVPR 2023 Highlight (Top 10%) paper Heterogeneous Continual Learning.

Authors: Divyam Madaan, Hongxu Yin, Wonmin Byeon, Jan Kautz, Pavlo Molchanov

For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing

TL;DR: First continual learning approach in which the architecture continuously evolves with the data.

Abstract

We propose a novel framework and a solution to tackle the continual learning (CL) problem with changing network architectures. Most CL methods focus on adapting a single architecture to a new task/class by modifying its weights. However, with rapid progress in architecture design, the problem of adapting existing solutions to novel architectures becomes relevant. To address this limitation, we propose Heterogeneous Continual Learning (HCL), where a wide range of evolving network architectures emerge continually together with novel data/tasks. As a solution, we build on top of the distillation family of techniques and modify it to a new setting where a weaker model takes the role of a teacher; meanwhile, a new stronger architecture acts as a student. Furthermore, we consider a setup of limited access to previous data and propose Quick Deep Inversion (QDI) to recover prior task visual features to support knowledge transfer. QDI significantly reduces computational costs compared to previous solutions and improves overall performance. In summary, we propose a new setup for CL with a modified knowledge distillation paradigm and design a quick data inversion method to enhance distillation. Our evaluation on various benchmarks shows a significant improvement in accuracy in comparison to state-of-the-art methods over various network architectures.

Contribution of this work

We propose a novel CL framework called Heterogeneous Continual Learning (HCL) to learn a stream of different architectures on a sequence of tasks while transferring the knowledge from past representations.
We revisit knowledge distillation and propose Quick Deep Inversion (QDI), which inverts the previous task parameters while interpolating the current task examples with minimal additional cost.
We benchmark existing state-of-the-art solutions in the new setting and outperform them with our proposed method across a diverse stream of architectures for both task-incremental and class-incremental CL.
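
The distillation setting described above, in which the previous (weaker) model acts as the teacher and the new (stronger) architecture as the student, can be illustrated with the standard temperature-scaled distillation loss. The snippet below is a minimal sketch in plain Python, not the loss implemented in this repository: the function names, the temperature value, and the example logits are assumptions made for illustration, and it does not cover QDI.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(max(ps, 1e-12)))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0  # terms with zero teacher probability contribute nothing
    )
    return kl * temperature ** 2


# Hypothetical logits: the teacher is the model from the previous task,
# the student is the new architecture being trained on the current task.
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [1.5, 0.7, -0.8]
loss = distillation_loss(student_logits, teacher_logits)
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge; in practice it would be combined with the usual supervised loss on the current task.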

Prerequisites

$ pip install -r requirements.txt

🚀 Quick start

Training

python main.py --data_dir ../data/ --log_dir ./logs/scl/ -c configs/cifar10/distil.yaml --ckpt_dir ./checkpoints/c10/scl/distil/ --hide_progress --cl_default --validation --hcl

Evaluation

python linear_eval_alltasks.py --data_dir ../data/ --log_dir ./logs/scl/ -c configs/cifar10/distil.yaml --ckpt_dir ./checkpoints/c10/scl/distil/ --hide_progress --cl_default --hcl

To change the dataset and method, use the configuration files from ./configs.
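
As one way to switch configurations without retyping the command, the sketch below lists the YAML files under ./configs and prints the Quick start training command for each. It is an illustrative helper, not part of this repository: the function names are hypothetical, it assumes configuration files use the .yaml suffix (as configs/cifar10/distil.yaml does), and it reuses only the flags shown in the Training command above.

```python
import shlex
from pathlib import Path


def build_train_command(config_path, data_dir="../data/", log_dir="./logs/scl/",
                        ckpt_dir="./checkpoints/c10/scl/distil/"):
    """Return the Quick start training command with a different config file.

    The default directories are copied from the Quick start example; adjust
    ckpt_dir and log_dir when switching to another dataset or method.
    """
    return [
        "python", "main.py",
        "--data_dir", data_dir,
        "--log_dir", log_dir,
        "-c", str(config_path),
        "--ckpt_dir", ckpt_dir,
        "--hide_progress", "--cl_default", "--validation", "--hcl",
    ]


def list_configs(config_root="./configs"):
    """Find YAML configuration files under config_root (assumes a .yaml suffix)."""
    root = Path(config_root)
    return sorted(root.rglob("*.yaml")) if root.is_dir() else []


if __name__ == "__main__":
    for config in list_configs():
        # Print rather than run, so each command can be inspected first.
        print(shlex.join(build_train_command(config)))
```

Printing the commands instead of executing them keeps the helper side-effect free; a chosen line can then be pasted into the shell.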

Contributing

We'd love to accept your contributions to this project. Please feel free to open an issue, or submit a pull request as necessary. If you have implementations of this repository in other ML frameworks, please reach out so we may highlight them here.

🎗️ Acknowledgment

The code is built upon aimagelab/mammoth, divyam3897/UCL, kuangliu/pytorch-cifar, sutd-visual-computing-group/LS-KD-compatibility, and berniwal/swin-transformer-pytorch.

We thank the authors for their amazing work and for releasing their code bases.

Licenses

This work is made available under the NVIDIA Source Code License-NC. See the LICENSE file for a copy of this license.

For license information regarding the mammoth repository, please refer to its repository.
For license information regarding the UCL repository, please refer to its repository.
For license information regarding the pytorch-cifar repository, please refer to its repository.
For license information regarding the LS-KD repository, please refer to its repository.
For license information regarding the swin-transformer repository, please refer to its repository.

📌 Citation

If you find this paper useful, please consider starring 🌟 this repo and citing 📑 our paper:

@inproceedings{madaan2023heterogeneous,
  title={Heterogeneous Continual Learning},
  author={Madaan, Divyam and Yin, Hongxu and Byeon, Wonmin and Kautz, Jan and Molchanov, Pavlo},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}
