Pruning from Scratch

Official implementation of the paper "Pruning from Scratch".

Requirements

  • pytorch == 1.1.0
  • torchvision == 0.2.2
  • apex @ commit: 574fe24

CIFAR10

  • learning channel importance gates from randomly initialized weights
python script/learn_gates.py -a ARCH --gpu GPU_ID --seed SEED -s SPARSITY -e EXPANSION

where ARCH is the network architecture type, SPARSITY is the sparsity ratio $r$ in the regularization term, and EXPANSION is the expansion channel number of the initial conv layer.
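For intuition, here is a minimal PyTorch sketch of the gate-learning idea, not the repo's actual implementation: each channel gets a learnable scalar gate while the randomly initialized weights stay fixed, and an L1-style penalty (whose exact form here is an assumption) drives the average gate magnitude toward the sparsity ratio $r$.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Learnable scalar gate per channel, multiplied onto a layer's feature maps."""
    def __init__(self, num_channels):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_channels))

    def forward(self, x):
        # x: (N, C, H, W); broadcast the per-channel gates over the spatial dims
        return x * self.gate.view(1, -1, 1, 1)

def sparsity_penalty(gate_modules, r, lam=0.1):
    # One plausible form of the regularization: push the mean gate magnitude
    # toward the target sparsity ratio r (the paper's exact term may differ).
    g = torch.cat([m.gate.abs().flatten() for m in gate_modules])
    return lam * (g.mean() - r).abs()
```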

  • pruning based on channel gates
python script/prune_model.py -a ARCH --gpu GPU_ID --seed SEED -s SPARSITY -e EXPANSION -p RATIO

where RATIO is the MACs reduction ratio of the pruned model; a larger ratio yields a more compact model.
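The pruning step can be pictured as choosing a global threshold on the learned gates so that the remaining channels fit the MACs budget implied by RATIO. The sketch below is illustrative only: the function and argument names are made up, and the MACs estimate ignores the coupling between a layer's input and output channels that the actual script accounts for.

```python
import torch

def prune_by_gates(gate_values, macs_per_channel, full_macs, ratio):
    """gate_values: one 1-D tensor of gate magnitudes per layer.
    macs_per_channel: rough MACs cost of keeping one channel in each layer.
    Returns a boolean keep-mask per layer meeting the (1 - ratio) * full_macs budget."""
    target = (1.0 - ratio) * full_macs
    thresholds = torch.sort(torch.cat(gate_values)).values
    for t in thresholds:  # raise the threshold until the kept channels fit the budget
        masks = [g > t for g in gate_values]
        macs = sum(m.sum().item() * c for m, c in zip(masks, macs_per_channel))
        if macs <= target:
            return masks
    return [g > thresholds[-1] for g in gate_values]
```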

  • training pruned model from scratch
python script/train_pruned.py -a ARCH --gpu GPU_ID --seed SEED -s SPARSITY -e EXPANSION -p RATIO --budget_train

where --budget_train activates the budget training scheme (Scratch-B) proposed in Rethinking the Value of Network Pruning, which trains the pruned model for the same computation budget as the full model. Empirically, this training scheme is crucial for improving the pruned model's performance.
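The budget-training idea amounts to a one-line scaling rule; the sketch below assumes compute is matched by extending the number of epochs in proportion to the MACs reduction (the function name is illustrative, not taken from the repo).

```python
def budget_epochs(base_epochs, full_macs, pruned_macs):
    # Scratch-B: a model with half the MACs is trained for roughly twice the epochs,
    # so the total training compute matches that of the full model.
    return int(round(base_epochs * full_macs / pruned_macs))

# Example: 160 base epochs, pruned model keeps 50% of the MACs -> 320 epochs
print(budget_epochs(160, full_macs=1.0, pruned_macs=0.5))
```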

ImageNet

  • prepare the ImageNet dataset following the instructions in link; this results in an imagenet folder with train and val sub-folders.
  • generate image index by
python script/prepare_imagenet_list.py --data_dir IMAGENET_DATA_DIR/train --dump_path data/train_images_list.pkl
python script/prepare_imagenet_list.py --data_dir IMAGENET_DATA_DIR/val --dump_path data/val_images_list.pkl
  • learning channel importance gates from randomly initialized weights
python script/learn_gates_imagenet.py -a ARCH --gpu GPU_ID -s SPARSITY -e EXPANSION -m MULTIPLIER

where MULTIPLIER controls the expansion of the channel numbers at the backbone outputs, while EXPANSION enlarges the intermediate channel numbers inside the InvertedResidual and Bottleneck blocks.
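As a rough illustration of the difference between the two flags (names are hypothetical, not from the repo), they scale different channel counts:

```python
def scaled_channels(base_out_channels, base_mid_channels, multiplier, expansion):
    out_c = int(base_out_channels * multiplier)  # -m: widen the backbone output channels
    mid_c = int(base_mid_channels * expansion)   # -e: widen the intermediate channels inside
                                                 #     InvertedResidual / Bottleneck blocks
    return out_c, mid_c
```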

  • pruning based on channel gates
python script/prune_model_imagenet.py -a ARCH --gpu GPU_ID -s SPARSITY -e EXPANSION -m MULTIPLIER -p RATIO
  • training pruned model from scratch (single node multiple gpus)
python -m torch.distributed.launch --nproc_per_node=NUM_GPU script/train_pruned_imagenet.py \
    -a ARCH -e EXPANSION -s SPARSITY -p RATIO -m MULTIPLIER \
    -b TRAIN_BATCH_SIZE --lr LR --wd WD --lr_scheduler SCHEDULER \
    --budget_train --label_smooth

where SCHEDULER is the learning rate scheduler type: 'multistep' for ResNet50 and 'cos' for MobileNets.
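For reference, here is a minimal sketch of the two scheduler choices using standard PyTorch APIs; the milestone epochs and T_max below are placeholders, not the repo's exact settings.

```python
import torch

model = torch.nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

# --lr_scheduler multistep (ResNet50): step the LR down at fixed epochs
multistep = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60, 90], gamma=0.1)

# --lr_scheduler cos (MobileNets): cosine decay over the full training run
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=120)
```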

Citation

@inproceedings{wang2020pruning,
    title={Pruning from Scratch},
    author={Wang, Yulong and Zhang, Xiaolu and Xie, Lingxi and Zhou, Jun and Su, Hang and Zhang, Bo and Hu, Xiaolin},
    booktitle={Proceedings of the 34th AAAI Conference on Artificial Intelligence},
    year={2020},
    publisher={AAAI Press},
    address={New York, USA}
}