Repository implementing Cross-Gradient Aggregation (CGA).
Paper accepted at the 38th International Conference on Machine Learning (ICML 2021).
In the proposed CGA algorithm,
- each agent computes gradients of model parameters on its own data set;
- each agent sends its model parameters to its neighbors;
- each agent computes the gradients of its neighbors' models on its own data set and sends the cross gradients back to the respective neighbors;
- cross gradients and local gradients are projected into a single aggregated gradient via Quadratic Programming, which is then used to
- update the model parameters.
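The projection step above can be sketched as a small quadratic program: find the minimum-norm point in the convex hull of the local and cross gradients. The sketch below solves that QP with Frank-Wolfe as a simplified, illustrative stand-in; the repo's actual QP formulation and solver may differ.

```python
import numpy as np

def qp_aggregate(grads, iters=200):
    """Aggregate local and cross gradients by approximately solving
    min_w ||G^T w||^2  s.t.  w >= 0, sum(w) = 1,
    i.e. the minimum-norm point in the convex hull of the gradients.
    Frank-Wolfe sketch only; not the repo's exact QP solver.
    """
    G = np.stack(grads)                 # (k, d): k gradient vectors
    k = G.shape[0]
    w = np.full(k, 1.0 / k)             # uniform starting weights
    GG = G @ G.T                        # Gram matrix of the gradients
    for t in range(iters):
        grad_w = GG @ w                 # gradient of 0.5 * ||G^T w||^2
        s = np.zeros(k)
        s[np.argmin(grad_w)] = 1.0      # linear-minimization oracle on simplex
        gamma = 2.0 / (t + 2.0)         # standard Frank-Wolfe step size
        w = (1.0 - gamma) * w + gamma * s
    return G.T @ w                      # aggregated gradient direction

# Two conflicting gradients: the aggregate balances both directions
# instead of following either one alone.
g_local = np.array([1.0, 0.0])
g_cross = np.array([0.0, 1.0])
agg = qp_aggregate([g_local, g_cross])
```

The aggregated direction decreases the loss on every agent's data simultaneously when such a direction exists, which is why CGA tolerates non-IID data splits better than plain parameter averaging.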
Example run:
python -m torch.distributed.launch --nnodes 1 --nproc_per_node 5 main.py --data_dist non-iid --opt CGA --epochs 5 --experiment 1 -log 5 --data CIFAR10 --model CNN --scheduler --momentum 0.5
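The launcher starts `--nproc_per_node` copies of `main.py` on one node, one process per agent. Each process can identify itself from the variables the launcher sets; a minimal sketch (with single-agent fallbacks for a direct run; the repo's `main.py` may instead take a `--local_rank` argument from the launcher):

```python
import os

# torch.distributed.launch spawns one process per agent and exposes the
# process identity through environment variables (assumption: LOCAL_RANK
# and WORLD_SIZE are set by the launcher; defaults cover a bare run).
rank = int(os.environ.get("LOCAL_RANK", 0))
world_size = int(os.environ.get("WORLD_SIZE", 1))
print(f"agent {rank} of {world_size}")
```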
Supported network topologies:
- Fully Connected
- Ring
- Bipartite
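Each topology corresponds to a mixing matrix over the agents that weights how neighbors' contributions are combined. As an illustration, a doubly stochastic mixing matrix for a ring, in which every agent averages equally with itself and its two neighbors (a generic sketch; the repo builds its own topology matrices):

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n agents (n >= 4):
    each agent weights itself and its two ring neighbors by 1/3."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0     # wrap around at the ends
    return W

W = ring_mixing_matrix(5)   # 5 agents, matching the example run above
```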
Implemented algorithms (`--opt`):
- CGA: Cross-Gradient Aggregation
- CompCGA: Compressed Cross-Gradient Aggregation
- CDSGD: Consensus Based Distributed Stochastic Gradient Descent
- CDMSGD: Consensus Based Distributed Momentum Stochastic Gradient Descent
- SGP: Stochastic Gradient Push
- SGA
- SwarmSGD
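CompCGA reduces communication cost by compressing the cross gradients before they are sent back to neighbors. A generic top-k sparsification sketch (the compressor actually used in the repo may differ):

```python
import numpy as np

def top_k_compress(grad, k):
    """Keep only the k largest-magnitude entries of a gradient vector.
    Generic sparsification sketch, not CompCGA's exact compressor."""
    idx = np.argsort(np.abs(grad))[-k:]   # indices of the k largest entries
    out = np.zeros_like(grad)
    out[idx] = grad[idx]                  # everything else is zeroed
    return out

g = np.array([0.1, -2.0, 0.05, 1.5])
cg = top_k_compress(g, 2)   # keeps -2.0 and 1.5, zeros the rest
```

In practice only the k surviving values and their indices need to be transmitted, shrinking each message from d floats to roughly 2k numbers.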
Supported models (`--model`):
- LR
- FCN
- CNN (CNN, Big_CNN, stl10_CNN, mnist_CNN)
- VGG (VGG11, VGG13, VGG16, VGG19)
- ResNet (resnet20, resnet32, resnet44, resnet56, resnet110, resnet1202, WideResNet28x10, PreResNet110)
Please cite our paper in your publications if it helps your research:
@article{esfandiari2021cross,
  title={Cross-Gradient Aggregation for Decentralized Learning from Non-IID data},
  author={Esfandiari, Yasaman and Tan, Sin Yong and Jiang, Zhanhong and Balu, Aditya and Herron, Ethan and Hegde, Chinmay and Sarkar, Soumik},
  journal={arXiv preprint arXiv:2103.02051},
  year={2021}
}


