MulKD
We are excited to announce the first public release of our Multi-Level Knowledge Distillation (MulKD) framework, implemented in PyTorch!
This release provides a complete, end-to-end pipeline for exploring hierarchical knowledge distillation on the CIFAR-10 dataset. The framework is designed to distill knowledge from a large, high-performance "Grandmaster" model down through successive levels of teachers to train highly efficient and accurate student models.
This version is the result of a significant code refactoring effort, moving from a single monolithic script to a clean, modular, and maintainable codebase.
What's in this Release?
- 🧪 Complete MulKD Pipeline: A full implementation of the hierarchical distillation cascade: Grandmaster → Level 1 Teacher Assistants (TAs) → Master → Level 2 TAs → Compact Model → Final Students (a sketch of the cascade follows this list).
- 🧩 Modular & Refactored Code: The entire project has been separated into logical modules for better readability and extensibility:
  - `config.py` for easy configuration of all hyperparameters.
  - `models.py` for all model architectures.
  - `losses.py` containing custom distillation losses.
  - `main.py` as the central orchestrator.
- 🔥 Hybrid Distillation Methods: Combines classic logit-based Knowledge Distillation (KD) with modern Contrastive Representation Distillation (CRD) for improved performance (the logit-based term is sketched after this list).
- ⚙️ Robust Training & Evaluation:
  - Automatic checkpointing to save the best model for each scenario.
  - Support for resuming training runs.
  - Automated generation of detailed performance plots (training curves, confusion matrices) and summary tables.
- 🚀 Configurable Experiments: Easily modify the dataset size (`DATASET_SUBSET_FRACTION`), training epochs, learning rates, and distillation parameters in `config.py` to run your own experiments.
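
To make the cascade concrete, here is a minimal sketch of how the stages chain together, with each trained model serving as the teacher for the next level. The `train_with_distillation` helper, the model constructors, and the stage keys are illustrative placeholders rather than the actual API in `main.py`:

```python
# Illustrative sketch of the MulKD cascade (placeholder names, not the real API).
# Each stage trains a smaller model using the previous stage's trained model
# as its teacher, so knowledge flows Grandmaster -> ... -> Final Students.

def run_cascade(models, train_with_distillation):
    """`models` maps stage names to freshly built (untrained) networks;
    `train_with_distillation(student, teacher)` returns the trained student."""
    stage_order = [
        "grandmaster",   # largest, highest-accuracy model
        "level1_ta",     # Level 1 teacher assistants
        "master",
        "level2_ta",     # Level 2 teacher assistants
        "compact",       # compact intermediate model
        "student",       # final efficient students
    ]
    teacher = None
    trained = {}
    for stage in stage_order:
        # The first stage has no teacher and is trained with plain cross-entropy;
        # every later stage distills from the previous stage's trained model.
        trained[stage] = train_with_distillation(models[stage], teacher)
        teacher = trained[stage]
    return trained
```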
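The logit-based half of the hybrid objective is the classic temperature-softened KD loss. The sketch below shows only that part, with illustrative parameter names and defaults; the contrastive (CRD) term operates on feature representations and is implemented in `losses.py`:

```python
import torch
import torch.nn.functional as F


def logit_kd_loss(student_logits, teacher_logits, targets,
                  temperature=4.0, alpha=0.9):
    """Classic logit-based KD: KL divergence between temperature-softened
    teacher/student distributions, blended with hard-label cross-entropy.
    Parameter names and default values here are illustrative."""
    soft_student = F.log_softmax(student_logits / temperature, dim=1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # The T^2 factor rescales the KD gradients to the same magnitude as the CE term.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce


if __name__ == "__main__":
    # Toy batch: 8 samples, 10 CIFAR-10 classes.
    student = torch.randn(8, 10)
    teacher = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    print(logit_kd_loss(student, teacher, labels))
```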
Getting Started
- Download the source code attached below (`MulKD-Source code (zip)`).
- Install the necessary dependencies from the `requirements.txt` file:
  `pip install -r requirements.txt`
- Run the main evaluation script:
  `python main.py`

For a quick test run, you can modify `DATASET_SUBSET_FRACTION` in `config.py` to a small value like `0.2` (20% of the data).
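
As a rough guide, a quick-test configuration could look like the snippet below. Only `DATASET_SUBSET_FRACTION` is named in this release; the other variable names are hypothetical placeholders for the epoch, learning-rate, and distillation settings that `config.py` exposes:

```python
# Hypothetical excerpt of config.py values for a fast smoke test.
# Only DATASET_SUBSET_FRACTION is confirmed by this release; the other
# names are illustrative placeholders.

DATASET_SUBSET_FRACTION = 0.2   # train on 20% of CIFAR-10 for a quick run
NUM_EPOCHS = 5                  # placeholder: fewer epochs for smoke testing
LEARNING_RATE = 0.05            # placeholder: optimizer learning rate
KD_TEMPERATURE = 4.0            # placeholder: softening temperature for KD
KD_ALPHA = 0.9                  # placeholder: weight on the KD term vs. CE
```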
We welcome feedback and contributions from the community!