Hello there! I'd like to request one of two alternatives. The goal is to be able to run models without reasoning.
I noticed that all of the models that can run on my computer (i.e., the ones selectable to me) are reasoning models. I am aware of the benefits of reasoning. However, the Ministral model in particular just cannot stop pumping text into its context, and it often takes an excruciating amount of time until the model actually begins generating response text. For many of my tasks I don't need or want reasoning, and skipping it would save compute and time.
Option A: Disable Reasoning Selectively
The llama.cpp server offers the CLI flag --reasoning-budget N, and, checking the server model selection, I can see that arguments are passed to the server (serverArgs in the catalog). If setting this doesn't completely derail reasoning models, it would be great if we could tune this parameter or toggle it on and off.
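To make the idea concrete, here is a minimal sketch of how a per-model override could be applied to the argument list before the server is launched. The helper name with_reasoning_budget and the assumption that serverArgs is a flat list of strings are hypothetical; I have not checked how the catalog actually stores them. As far as I understand, a budget of 0 is meant to disable thinking and -1 leaves it unrestricted, but that should be verified against the llama-server help output.

```python
from typing import List

REASONING_FLAG = "--reasoning-budget"


def with_reasoning_budget(server_args: List[str], budget: int) -> List[str]:
    """Return a copy of server_args with --reasoning-budget set to `budget`.

    Hypothetical helper: assumes server_args is a flat list of CLI tokens,
    e.g. ["--ctx-size", "8192"]. Any existing --reasoning-budget entry
    (either "--reasoning-budget N" or "--reasoning-budget=N") is dropped
    so that the new value is the only one passed to the server.
    """
    result: List[str] = []
    skip_next = False
    for arg in server_args:
        if skip_next:
            # This token is the value of a previously seen --reasoning-budget.
            skip_next = False
            continue
        if arg == REASONING_FLAG:
            skip_next = True
            continue
        if arg.startswith(REASONING_FLAG + "="):
            continue
        result.append(arg)
    result.extend([REASONING_FLAG, str(budget)])
    return result


if __name__ == "__main__":
    # Assumed catalog entry; the real serverArgs may look different.
    catalog_args = ["--ctx-size", "8192", "--reasoning-budget", "-1"]
    print(with_reasoning_budget(catalog_args, 0))
```

A UI toggle could then simply choose between two budget values for the selected model, leaving all other arguments from the catalog untouched.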
Option B: Offer non-reasoning models
The alternative would be to offer at least one non-reasoning model in the catalog.
Or is there a way to add this setting to the WebUI so it can be toggled on demand? In that case I could even try to implement it myself, or request it in the appropriate repository.
Thank you for considering & the great work!