Pull requests: NVIDIA/TransformerEngine

Pull requests list

remove code duplication in a test
#915 opened Jun 11, 2024 by rybakov
Add the option to use SM for P2P comm in TP overlap
#914 opened Jun 11, 2024 by erhoo82
1 of 11 tasks
Fix TE assert weight error
#913 opened Jun 11, 2024 by j316chuck
Fix local cpp tests after inplace build (label: bug)
#911 opened Jun 11, 2024 by ksivaman
8 of 11 tasks
[PyTorch] Expose multi_tensor_* kernels
#907 opened Jun 11, 2024 by yaox12
8 of 11 tasks
disable using nvfuser when pytorch version >= 2.2
#905 opened Jun 11, 2024 by sudhakarsingh27
1 of 4 tasks
[Common] Added JIT-compiled fused cast transpose kernels (label: enhancement)
#903 opened Jun 10, 2024 by Oleg-Goncharov
6 of 11 tasks
[C/PyTorch] Removed MPI dependence in Userbuffers
#901 opened Jun 10, 2024 by denera
8 of 11 tasks
[JAX] Splitting cpp_extensions.py (labels: enhancement, jax)
#899 opened Jun 7, 2024 by phu0ngng
5 of 11 tasks
Add norm_factor arg into DotProductAttention
#897 opened Jun 7, 2024 by BoxiangW
11 tasks
Add documentation for dot product attention
#889 opened Jun 4, 2024 by cyanguwa
2 of 4 tasks
Use unoptimized RMSNorm kernel if pointers are not aligned (label: bug)
#886 opened Jun 3, 2024 by timmoon10
4 of 11 tasks
Fp8 model init factory
#880 opened May 30, 2024 by sudhakarsingh27 Draft
Avoid framework specific import from top level (label: enhancement)
#862 opened May 22, 2024 by ksivaman Draft
6 of 11 tasks
[Common/PyTorch] Grouped GEMM via multi-stream cuBLAS
#853 opened May 17, 2024 by yaox12
8 of 11 tasks
Generation tutorial for Gemma model
#829 opened May 1, 2024 by pggPL
8 of 11 tasks
[UB] Adding support for multinode nvlink
#815 opened Apr 26, 2024 by shamisp
Bug fix in DGRAD->RS overlap
#802 opened Apr 23, 2024 by vasunvidia Draft
[PyTorch] Fix minor bug in computing num_gqa_groups_per_partition (label: bug)
#777 opened Apr 13, 2024 by knowlsie
[C/PyTorch] Refactor and move userbuffers into TE/common
#760 opened Apr 8, 2024 by denera
10 of 13 tasks
Fix bhss bias format before sm90
#736 opened Mar 27, 2024 by zlsh80826