afiaka87 edited this page Apr 22, 2021 · 8 revisions

Train a VAE or DALLE with your own dataset.

Train with multiple GPUs and/or multiple distributed nodes.

https://github.com/lucidrains/DALLE-pytorch/wiki/Multi-GPU-and-Multi-Node

Train on vast.ai.

https://github.com/lucidrains/DALLE-pytorch/wiki/Vast.ai-Sparse-Attention

Attention Types

https://github.com/lucidrains/DALLE-pytorch/wiki/Attention-Layers

(Experimental) Train with DeepSpeed ZeRO Infinity

Brand new from Microsoft. If you can manage to install it, you'll get support for both parameter and optimizer CPU offloading, as well as a host of other features grouped under "ZeRO Stage 3".

Installation is tough, to say the least.
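
For a rough idea of what parameter and optimizer CPU offloading looks like in practice, here is a minimal sketch of a DeepSpeed config with ZeRO Stage 3 enabled. The helper name `build_zero3_offload_config`, the batch size, and the fp16 setting are illustrative assumptions, not values taken from this repo; only the `zero_optimization` keys are standard DeepSpeed config fields.

```python
import json


def build_zero3_offload_config(train_batch_size=8):
    """Hypothetical helper: returns a DeepSpeed config dict for ZeRO Stage 3
    with both parameters and optimizer state offloaded to CPU."""
    return {
        # Assumed value; pick whatever fits your hardware.
        "train_batch_size": train_batch_size,
        # Assumed; mixed precision is optional.
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": 3,
            # Keep optimizer state in CPU memory.
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            # Keep model parameters in CPU memory when not in use.
            "offload_param": {"device": "cpu", "pin_memory": True},
        },
    }


config = build_zero3_offload_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON can be saved to a file and handed to DeepSpeed as its config; how the training script picks it up depends on your setup, so check the linked page below for the exact invocation.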

Train with DeepSpeed ZeRO Infinity

Citations

https://github.com/lucidrains/DALLE-pytorch/wiki/Citations