Pull requests: NVIDIA/apex
Use master weights for bfloat16 FusedAdam when master_weights=True
#1731
opened Sep 22, 2023 by
cbcase
Make distributed fused lamb test names friendly to keyword filtering
contrib
#1698
opened Jul 20, 2023 by
crcrpar
A FasterRMSNorm implementation (based on FasterLayerNorm)
#1688
opened Jun 30, 2023 by
Njuapp
[sparse]update support for arbitrary N:M settings sparse
#1631
opened Apr 2, 2023 by
LeiWang1999
Fix incorrect initialization of GenericFusedScaleMaskSoftmax
#1586
opened Feb 15, 2023 by
qmdnls
Support grad_output with noncontiguous strides in LinearWithGradAccum…
#1573
opened Jan 23, 2023 by
cat-state
Optimize halo exchange in spatial-parallel bottleneck layer
#1569
opened Jan 20, 2023 by
timmoon10
Instance norm (and batch, layer norm) using NVFuser Python frontend only
#1564
opened Jan 17, 2023 by
jacobhinkle
Draft