Add GPU-side Gumbel-max sampling for CUDA graph compatibility #493

Annotations

1 warning

test-mlx-llm (unsloth/Llama-3.2-1B-Instruct, llama-1b, true, 4w) / test-mlx-llm-llama-1b-custom-4w

succeeded Apr 24, 2026 in 10m 15s