Add GPU-side Gumbel-max sampling for CUDA graph compatibility #493

Annotations

1 warning

test-mlx-llm (unsloth/Llama-3.2-1B-Instruct, llama-1b, true, nvfp4) / test-mlx-llm-llama-1b-custom-nvfp4

succeeded Apr 24, 2026 in 9m 55s