Pull requests: NVIDIA/TransformerEngine
Pull requests list
[PyTorch][Flash Attn] Add fallback import for FA3
#2806
opened Mar 26, 2026 by
eattia-nvidia
7 of 13 tasks
[PyT] Fix FSDP2 memory leaks for FP8 weight workspaces and transpose caches
#2805
opened Mar 26, 2026 by
pstjohn
3 tasks done
Fix empty CUDA_ARCHITECTURES when SM120 is the only arch
#2804
opened Mar 26, 2026 by
sudhakarsingh27
13 tasks
[PyT][Test] Add xfailing FSDP2 memory leak detection tests
#2803
opened Mar 25, 2026 by
pstjohn
4 tasks done
[PyTorch] [CI] Capture subprocess stderr in distributed tests for better CI error re…
#2802
opened Mar 25, 2026 by
sudhakarsingh27
•
Draft
13 tasks
[JAX] Warmup FFIs with "initialize" stage
#2800
opened Mar 25, 2026 by
jberchtold-nvidia
•
Draft
13 tasks
[JAX] Add warning if using BSHD and max_segments_per_seq > 1
#2796
opened Mar 24, 2026 by
jberchtold-nvidia
8 of 13 tasks
If model parameters are DTensors, optimizer states should also be DTensors.
#2795
opened Mar 24, 2026 by
cspades
1 of 13 tasks
Avoid CPU offload wait_event for validation
#2793
opened Mar 23, 2026 by
vasunvidia
13 tasks
[PyTorch] [torch.compile] Remove module reference from autograd function args
#2791
opened Mar 23, 2026 by
pggPL
8 of 13 tasks
Optimize fp8 block scaling Allgather for FSDP2
#2789
opened Mar 23, 2026 by
vthumbe1503
1 of 13 tasks
[Common][JAX] Add CUB TopK MaxPairs interface
#2784
opened Mar 20, 2026 by
huanghua1994
8 of 13 tasks
Optimize naive top-k masking in fused router
#2783
opened Mar 19, 2026 by
yosh20004
3 of 13 tasks
Fused Adam Support for MXFP8 + FSDP2 integration
#2780
opened Mar 18, 2026 by
vthumbe1503
•
Draft
13 tasks
[fused_router][pytorch] Optimize naive topk path and add perf benchmark
#2776
opened Mar 18, 2026 by
XiaomingFun233
add mark_not_offload() interface for cpu_offload_v1
#2770
opened Mar 17, 2026 by
lhb8125
13 tasks
GEMM + Swiglu fused Grouped MLP for MXFP8
2.14.0
MoE
#2769
opened Mar 17, 2026 by
ksivaman
13 tasks
[Draft]Support for score_mod and score_mod_bprop in cuDNN's sdpa
#2767
opened Mar 16, 2026 by
vcherepanov-nv
2 of 13 tasks