fix(proxy): guard CCR tool injection against frozen prefix to preserve cache#298
Merged
chopratejas merged 1 commit intochopratejas:mainfrom Apr 28, 2026
Conversation
…e cache The Anthropic handler's CCR injector path applied a frozen_message_count guard to system instruction injection but not to tool injection. When Kompress fired for the first time in a session, the tools array was mutated unconditionally, invalidating Anthropic's prefix cache and dropping cache_read_input_tokens to zero on calls where ~48K tokens were previously being cached. Mirror the existing inject_system_instructions guard for inject_tool: when frozen_message_count > 0, defer tool injection so the warm prefix stays intact. Adds test_ccr_tool_injection_disabled_when_prefix_frozen as a direct companion to the existing test_ccr_system_instruction_injection_ disabled_when_prefix_frozen. Fixes chopratejas#294 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Owner
|
Thank you for fixing this! Appreciate it - looking forward to the follow up PR |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #294 — the CCR tool injection path had no
frozen_message_countguard, so the first Kompress call in a session mutated the tools array and busted Anthropic's prefix cache (droppingcache_read_input_tokensfrom ~48K → 0).The system instruction injection path already had this guard at
proxy/handlers/anthropic.py:963-968. This PR mirrors it forinject_tool— the simplest version of the fix suggested in the issue.Change
headroom/proxy/handlers/anthropic.py— whenfrozen_message_count > 0, setinject_tool=Falseand log a deferral message (matching the existing system-instruction guard's tone). The injector is still constructed (it may still need to scan for compressed content), but it will not mutate the tools array.Test
Adds
test_ccr_tool_injection_disabled_when_prefix_frozenintests/test_proxy_anthropic_cache_stability.py— a direct companion to the existingtest_ccr_system_instruction_injection_disabled_when_prefix_frozen, using the same_FakePrefixTracker+ monkeypatchedCCRToolInjectorpattern. Asserts the injector receivesinject_tool=Falsewhenfrozen_count=1.Test plan
ruff checkon modified files — cleanruff format --checkon modified files — cleantest_proxy_anthropic_cache_stability.py— 17/17 pass (including the new test)test_ccr*,test_proxy_ccr) — 140 pass; the only failures are pre-existingheadroom._coreRust-extension import errors unrelated to this pathNotes
The issue suggests a more complete variant that defers tool injection until the next call where
frozen_message_count == 0(i.e. injects "for free" during a TTL re-warm). That's a follow-up — this PR ships the simple, symmetric fix that resolves the cache bust. Happy to extend if maintainers prefer the deferral approach.🤖 Generated with Claude Code