SignalGate is a semantic routing layer for OpenClaw. It exposes an OpenAI-compatible API on loopback and routes each request to the right upstream model tier (budget, balanced, premium) using local embeddings + KNN and hard capability gates.
Status: released. Current version: 1.0.3
- Stop defaulting every prompt to the most expensive model.
- Keep OpenClaw pipelines intact (no prompt rewriting, no rule forests).
- Make routing observable, deterministic under failure, and safe for tool-driven automation.
- Local OpenAI-compatible base URL: `http://127.0.0.1:8765/v1`
- Primary endpoints:
  - `GET /healthz`
  - `GET /readyz`
  - `GET /v1/models`
  - `GET /metrics` (lightweight JSON counters)
  - `POST /v1/chat/completions` (streaming and non-streaming)
- `signalgate/auto` - semantic tier routing
- `signalgate/budget` - force budget tier
- `signalgate/balanced` - force balanced tier
- `signalgate/premium` - force premium tier
- `signalgate/chat-only` - disable tool usage for the request
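These aliases are selected through the standard `model` field of a chat/completions request. A minimal sketch of building such a payload (the helper name is illustrative, not part of SignalGate):

```python
def make_request(prompt: str, tier: str = "auto") -> dict:
    """Build an OpenAI-style chat payload routed by a SignalGate alias."""
    assert tier in {"auto", "budget", "balanced", "premium", "chat-only"}
    return {
        "model": f"signalgate/{tier}",
        "messages": [{"role": "user", "content": prompt}],
    }

payload = make_request("Summarize this log file.", tier="budget")
print(payload["model"])  # signalgate/budget
```

POST this JSON to `/v1/chat/completions` with any OpenAI-compatible client pointed at the local base URL.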
- Security gates
  - Optional auth header on loopback (raw token or `Bearer <token>`)
  - Request body size limit
  - Optional request field stripping
- Capability gates (manifest-driven): SignalGate filters candidates by required capabilities inferred from the request shape:
  - `tools` / `tool_choice`
  - JSON/schema `response_format`
  - streaming
  - context window and max output
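The shape-based inference above might be sketched as follows. Field names follow the OpenAI chat API; the function names are illustrative, and context-window/max-output checks are omitted for brevity:

```python
def required_capabilities(request: dict) -> set:
    """Infer capability requirements from the request shape."""
    caps = set()
    if request.get("tools") or request.get("tool_choice"):
        caps.add("tools")
    response_format = request.get("response_format") or {}
    if response_format.get("type") in ("json_object", "json_schema"):
        caps.add("json")
    if request.get("stream"):
        caps.add("streaming")
    return caps

def passes_gate(model_caps: set, request: dict) -> bool:
    """A candidate survives only if it advertises every required capability."""
    return required_capabilities(request) <= model_caps
```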
- Tier selection
  - For `signalgate/auto`, SignalGate uses:
    - local embeddings (GGUF via llama.cpp)
    - a KNN classifier trained on labeled workloads
    - uncertainty promotion (similarity threshold + margin threshold)
    - high-risk floor: tools/JSON => at least balanced
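The auto-routing steps above can be sketched as a small KNN vote with uncertainty promotion. All thresholds, the dataset shape, and the promotion rule below are illustrative assumptions, not SignalGate's actual implementation:

```python
import math

TIERS = ["budget", "balanced", "premium"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def route(embedding, dataset, k=3, sim_floor=0.70, margin_floor=0.10, high_risk=False):
    """KNN vote over (vector, tier) examples; low top similarity or a thin
    vote margin promotes the choice one tier up, and high-risk requests
    (tools/JSON) never land below balanced."""
    scored = sorted(((cosine(embedding, vec), tier) for vec, tier in dataset), reverse=True)
    top = scored[:k]
    votes = {}
    for _sim, tier in top:
        votes[tier] = votes.get(tier, 0) + 1
    ranked = sorted(votes.items(), key=lambda kv: -kv[1])
    tier = ranked[0][0]
    margin = (ranked[0][1] - (ranked[1][1] if len(ranked) > 1 else 0)) / k
    if top[0][0] < sim_floor or margin < margin_floor:
        tier = TIERS[min(TIERS.index(tier) + 1, len(TIERS) - 1)]  # promote on uncertainty
    if high_risk and TIERS.index(tier) < TIERS.index("balanced"):
        tier = "balanced"  # tools/JSON floor
    return tier
```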
- Candidate scoring + provider preference
- Provider preference biases selection (Gemini first, OpenAI second by default)
- Scoring uses manifest pricing + preference bias
- Stickiness (consistent hashing) reduces provider/model flip-flop
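One common way to get the stickiness described above is rendezvous (highest-random-weight) hashing: the same session key keeps the same candidate, and removing a candidate only remaps the sessions that were on it. This sketch assumes SHA-256 and string keys; it is not necessarily the exact scheme SignalGate uses:

```python
import hashlib

def sticky_pick(session_key: str, candidates: list) -> str:
    """Pick the candidate with the highest hash weight for this session."""
    def weight(candidate: str) -> int:
        digest = hashlib.sha256(f"{session_key}:{candidate}".encode()).hexdigest()
        return int(digest, 16)
    return max(candidates, key=weight)
```

Because each (session, candidate) pair hashes independently, dropping one candidate never reshuffles sessions pinned to the others.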
- Execution + robustness
- Bounded queue and bounded concurrency (global/provider/model semaphores)
- Per-model circuit breakers (cooldown + half-open)
- Deterministic failover ladder
  - No-tools: allow one failover
  - Tools/JSON: no retry once side effects may have occurred
- OpenAI upstream (chat/completions) including streaming passthrough
- Gemini upstream (generateContent / streamGenerateContent) via format adaptation (no rewriting)
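The per-model breaker behavior above (cooldown + half-open) can be sketched roughly as follows; the threshold and cooldown values are illustrative, not SignalGate defaults:

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures, stay open for `cooldown`
    seconds, then allow exactly one half-open probe before deciding."""

    def __init__(self, threshold=3, cooldown=30.0, clock=time.monotonic):
        self.threshold, self.cooldown, self.clock = threshold, cooldown, clock
        self.failures = 0
        self.opened_at = None
        self.half_open = False

    def allow(self) -> bool:
        if self.opened_at is None:
            return True                       # closed: requests flow
        if self.clock() - self.opened_at >= self.cooldown and not self.half_open:
            self.half_open = True             # half-open: one probe allowed
            return True
        return False                          # open, or probe already in flight

    def record_success(self):
        self.failures, self.opened_at, self.half_open = 0, None, False

    def record_failure(self):
        self.failures += 1
        if self.half_open or self.failures >= self.threshold:
            self.opened_at = self.clock()     # (re)open and restart cooldown
            self.half_open = False
```

A failed half-open probe reopens the breaker and restarts the cooldown; a success closes it fully.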
Key controls are configured via `security.*` in runtime config:
- Auth token header (recommended)
- HTTPS-only upstream enforcement + hostname allowlist
- Upstream error-body redaction unless debug
- Default hashed user forwarding (or drop)
- Optional request field stripping (`strip_unknown`) to reduce pass-through surface
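A rough sketch of two of the request-hygiene controls above, field stripping and hashed user forwarding. The allowlist contents and the hash truncation are illustrative assumptions:

```python
import hashlib

# Illustrative allowlist, not SignalGate's actual field set.
ALLOWED_FIELDS = {
    "model", "messages", "tools", "tool_choice", "response_format",
    "stream", "max_tokens", "temperature", "top_p", "user",
}

def strip_unknown(body: dict) -> dict:
    """Drop fields outside the allowlist before forwarding upstream."""
    return {k: v for k, v in body.items() if k in ALLOWED_FIELDS}

def forward_user(body: dict) -> dict:
    """Replace the `user` field with a stable hash so upstreams can
    rate-limit per user without seeing the raw identifier."""
    out = dict(body)
    if "user" in out:
        out["user"] = hashlib.sha256(out["user"].encode()).hexdigest()[:16]
    return out
```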
- Runtime config: `docs/config.schema.json` (example: `docs/config.example.json`)
- Capability manifest: `docs/manifest.schema.json` (example: `docs/manifest.example.json`)
- KNN dataset contract: `docs/DATASET.md`
Optional (v1.0.3):
- Two-phase tools routing: `features.enable_two_phase_tools=true` (tuning: `two_phase.min_margin_for_plan`)
- Metrics JSONL sink (routing outcomes only, no prompts): `metrics.enabled=true` + `metrics.jsonl_path`
- Cost baseline for savings percent: `cost.baseline_model_key=<manifest model key>`
- Cost uses upstream `usage` token counts when available; estimation is off by default (`cost.allow_estimates=false`)
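Given upstream `usage` token counts and per-token pricing from the manifest, the savings percent against the baseline model reduces to simple arithmetic. The pricing shape below (per-million-token input/output rates) is illustrative, not the manifest's actual schema:

```python
def cost_usd(usage: dict, price: dict) -> float:
    """Cost from upstream `usage` counts and per-million-token pricing."""
    return (usage["prompt_tokens"] * price["input_per_mtok"]
            + usage["completion_tokens"] * price["output_per_mtok"]) / 1_000_000

def savings_percent(usage: dict, routed_price: dict, baseline_price: dict) -> float:
    """Savings vs. the model configured as the cost baseline."""
    baseline = cost_usd(usage, baseline_price)
    return 100.0 * (baseline - cost_usd(usage, routed_price)) / baseline

usage = {"prompt_tokens": 1_000, "completion_tokens": 500}
budget = {"input_per_mtok": 0.10, "output_per_mtok": 0.40}
premium = {"input_per_mtok": 1.00, "output_per_mtok": 4.00}
print(round(savings_percent(usage, budget, premium), 1))  # 90.0
```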
```
uv sync --extra dev --extra embed
export SIGNALGATE_CONFIG_PATH=./config.json
export SIGNALGATE_TOKEN=...  # if auth enabled
uv run signalgate
```

In config:

```
server.uds_path=/tmp/signalgate.sock
```

Run:

```
uv run signalgate
```

See docs/TESTING.md.
See docs/LOAD_TESTING.md.
Default suite:

```
uv run pytest
```

Live upstream tests:

```
uv run pytest -m e2e
```

See docs/OPENCLAW_INTEGRATION.md.
See docs/UPSTREAMS.md for Gemini, OpenAI, and Anthropic configuration patterns.
Licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later). See LICENSE.
