Support SDXL and its distributed inference #1514

Zars19 · 2024-04-28T14:50:02Z

The idea of patch parallelism comes from the CVPR 2024 paper DistriFusion. To keep the implementation simple, all communication in this example is synchronous.

This reduces SDXL inference latency, especially at very high resolutions.
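To make the synchronous patch-parallel idea above concrete, here is a minimal single-process sketch. It is not the code in this PR: the "devices" are just list entries, `all_gather` is a hypothetical stand-in for a blocking collective, and a vertical blur stands in for a UNet layer that needs rows from neighbouring patches. It only illustrates that gathering the full feature map before every step lets each device compute its own rows and still match the single-device result.

```python
# Single-process simulation of synchronous ("full_sync"-style) patch parallelism.
# All function names are hypothetical; no real distributed backend is used.

def split_rows(image, n_devices):
    """Split a 2D list into n_devices contiguous row patches."""
    assert len(image) % n_devices == 0, "height must divide evenly across devices"
    rows_per = len(image) // n_devices
    return [image[i * rows_per:(i + 1) * rows_per] for i in range(n_devices)]

def all_gather(patches):
    """Stand-in for a blocking all-gather: returns the full feature map."""
    return [row for patch in patches for row in patch]

def vertical_blur_rows(full, start, stop):
    """Average rows [start, stop) with their vertical neighbours (edges clamped)."""
    h = len(full)
    out = []
    for r in range(start, stop):
        up, down = full[max(r - 1, 0)], full[min(r + 1, h - 1)]
        out.append([(a + b + c) / 3 for a, b, c in zip(up, full[r], down)])
    return out

def patch_parallel_step(patches):
    """One step: synchronize first, then each device computes only its own rows."""
    full = all_gather(patches)  # synchronous communication before compute
    out, start = [], 0
    for patch in patches:
        stop = start + len(patch)
        out.append(vertical_blur_rows(full, start, stop))
        start = stop
    return out

def run(image, n_devices, steps):
    """Run `steps` blur steps with the image split across n_devices patches."""
    patches = split_rows(image, n_devices)
    for _ in range(steps):
        patches = patch_parallel_step(patches)
    return all_gather(patches)

image = [[float(r * 4 + c) for c in range(4)] for r in range(8)]
single = run(image, n_devices=1, steps=3)
split = run(image, n_devices=4, steps=3)
print(single == split)
```

Because every step waits for the gather, the patch boundaries never see stale activations. That is the simplification synchronous communication buys, at the cost of not overlapping communication with compute.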

Benchmark setup: SDXL on A100, 50 steps, 2048x2048.

| Framework | sync_mode | n_gpu | latency (s) | speed_up | memory (MiB) |
| --- | --- | --- | --- | --- | --- |
| Torch | - | 1 | 25.25 | 1x | 42147 |
| TRT | - | 1 | 21.98 | 1.15x | 42895 |
| DistriFusion (Torch) | split_batch | 2 | 13.33 | 1.89x | 40173 |
| Ours | split_batch | 2 | 11.69 | 2.16x | 42675 |
| DistriFusion (Torch) | corrected_async_gn | 4 | 8.27 | 3.05x | 49087 |
| DistriFusion (Torch) | full_sync | 4 | 8.64 | 2.92x | 51943 |
| Ours | full_sync | 4 | 7.73 | 3.27x | 43073 |
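The `speed_up` column appears to be the single-GPU Torch latency divided by each row's latency. Assuming that baseline (the row marked `1x`), a short check reproduces the reported values; the helper name `speedup` is just for illustration.

```python
# Recompute speed_up from the latencies in the table.
# Assumption: the baseline is the single-GPU Torch row (25.25 s).
BASELINE_S = 25.25

rows = [
    ("TRT", "-", 1, 21.98),
    ("DistriFusion (Torch)", "split_batch", 2, 13.33),
    ("Ours", "split_batch", 2, 11.69),
    ("DistriFusion (Torch)", "corrected_async_gn", 4, 8.27),
    ("DistriFusion (Torch)", "full_sync", 4, 8.64),
    ("Ours", "full_sync", 4, 7.73),
]

def speedup(baseline_s, latency_s):
    """Speedup relative to the baseline, rounded to two decimals."""
    return round(baseline_s / latency_s, 2)

for framework, sync_mode, n_gpu, latency_s in rows:
    print(f"{framework:22s} {sync_mode:20s} {n_gpu} {speedup(BASELINE_S, latency_s)}x")
```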

Add distributed inference for UNet models and SDXL examples

8d3d8a1

Zars19 force-pushed the main branch from e95c53c to 8d3d8a1 on April 28, 2024 14:59

Zars19 changed the title ~~Add distributed inference for UNet models and SDXL examples~~ Support SDXL and its distributed inference Apr 30, 2024
