NVIDIA / cutlass Public

Pull requests: NVIDIA/cutlass

132 Open 846 Closed

Pull requests list

[CuTeDSL] [demo] workaround of MLIR codegen

#3161 opened Apr 11, 2026 by yinghai

[CuTeDSL][fix]: 1d bias epilogue fix

#3157 opened Apr 9, 2026 by leevan

Add absf and floor to cute.math

#3156 opened Apr 8, 2026 by nandor

[CuTeDSL] Update atomic_max_float32 to atomic_fmax in blockscaled GEMM amax example

#3155 opened Apr 8, 2026 by questa-wang Contributor

Add support for empty dataclass arguments

#3152 opened Apr 7, 2026 by nandor

Fix incorrect example paths in CuTeDSL docstrings

#3151 opened Apr 6, 2026 by Weili-0234

[Hopper CuTeDSL] Add FP8 GEMM with 2xAcc

#3149 opened Apr 5, 2026 by Johnsonms Contributor

[fix][CuTeDSL] Fix dynamic shape args not passed to JIT kernel

#3148 opened Apr 5, 2026 by Flink-ddd

clamp max_workers to at least 1 for single-cpu systems

#3146 opened Apr 5, 2026 by knQzx

[CuTeDSL] Fix incorrect package-data key in pyproject.toml

#3145 opened Apr 3, 2026 by Johnsonms Contributor

[Fix] distributed gemm all reduce and reduce scatter examples

#3143 opened Apr 1, 2026 by Nicyzk

correct BLayout stride in SM80 m16n8k32 int4 MMA traits

#3140 opened Apr 1, 2026 by zfmmmm

Fix Hopper FMHA performance regression on CUDA < 13.1

#3137 opened Mar 31, 2026 by arvin-chou

5 of 6 tasks

feat(CuTeDSL): print benchmark time from Blackwell dense_gemm CLI

#3136 opened Mar 30, 2026 by aidando73 Contributor

[CuTeDSL] Add a render function hook to allow render layout natively

#3135 opened Mar 30, 2026 by kainzhong

Fix incorrect warp arrangement in PitchLinearWarpStripedThreadMap

#3131 opened Mar 25, 2026 by HaoYuan-Gao

Fix elementwise_apply.py

#3129 opened Mar 25, 2026 by HydraQYH Contributor

[CuTeDSL] Add SM103 grouped block-scaled GEMM kernel and tests

#3124 opened Mar 23, 2026 by Johnsonms Contributor

Enable strict C++ compiler warnings with -Werror

#3123 opened Mar 22, 2026 by maxwbuckley

3 of 4 tasks

[bugfix] use acquire to prevent reordering.

#3118 opened Mar 20, 2026 by shubaoyu2 Contributor

Fix typo in elementwise_add.py

#3116 opened Mar 20, 2026 by HydraQYH Contributor

Add FlashMoE Publication

#3115 opened Mar 20, 2026 by osayamenja

[FMHA] Add SM110 support for Blackwell FMHA example (77_blackwell_fmha)

#3112 opened Mar 18, 2026 by LiangSu8899

fix(CuTeDSL): correct FP4 tensor K dimension in grouped blockscaled GEMM (label: inactive-30d)

#3102 opened Mar 13, 2026 by Hale423

[docs] Fix same typo (label: inactive-30d)

#3098 opened Mar 9, 2026 by lhtin
