From 049ca9cab2ec3c288c1a9c39bc7b3bba760c8a4d Mon Sep 17 00:00:00 2001
From: chopratejas <chopratejas@gmail.com>
Date: Tue, 28 Apr 2026 11:28:25 -0700
Subject: [PATCH 1/2] fix(smart_crusher): re-attach TOIN learning loop + audit
 known regressions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The Stage 3c.1b retirement of the Python SmartCrusher silently disconnected
three subsystems. The audit on 2026-04-28 caught them; this commit fixes
what's fixable today and labels the rest visibly.

## TOIN learning loop — fixed

Before: `ContentRouter._record_to_toin` skipped SmartCrusher on the
assumption SmartCrusher recorded its own TOIN events. The retired Python
class did. The Rust port doesn't know about TOIN. Net result: the
highest-traffic compression strategy stopped fueling the learning loop,
silently.

Fix: shim's `crush()` and `_smart_crush_content()` now call
`toin.record_compression()` after a real compression. Filtered on
`strategy != "passthrough"` because the Rust port flips
`was_modified=True` from JSON whitespace re-canonicalization. Best
effort: TOIN failures are logged at debug level and never break
compression.

Token estimates use `len(json) // 4` (the rule the retired Python used)
because the router doesn't pass a tokenizer down to this layer and
re-tokenizing here would dominate the recording cost.

7 tests in `tests/test_smart_crusher_toin_attachment.py`:
- crush() records on real compression, doesn't on passthrough
- structurally-similar inputs land on the same pattern
- _smart_crush_content() records (legacy apply() path)
- TOIN errors don't break compression
- non-JSON input doesn't record
- inject_retrieval_marker=False emits a WARNING

## CCR marker emission knob — labelled, not yet fixed

`ccr_config.inject_retrieval_marker=False` is not honored — the Rust
port emits `<<ccr:HASH N_rows_offloaded>>` markers in `dropped_summary`
unconditionally. Today the production default has the flag True so no
one is hitting the gap, but the silent-disconnect was a real
regression. Shim now logs a WARNING when callers pass `False` so the
mismatch is visible. Fix needs a Rust-side gate; tracked in
`RUST_DEV.md`.

## Custom relevance scorer — labelled, not yet fixed

`relevance_config` and `scorer` constructor args are accepted for
source compatibility but the Rust default `HybridScorer` always
runs. Shim was logging this at debug; bumped to WARNING. Tracked.

## RUST_DEV.md

New "Known regressions in retired-Python components" section with a
table per retired component. The intent is that this section gets
updated whenever a regression closes (or a new one is found), so the
duplicate-codebase tax stays visible instead of decaying into folklore.
---
 RUST_DEV.md                                 |  31 ++++
 headroom/transforms/smart_crusher.py        | 175 ++++++++++++++++--
 tests/test_smart_crusher_toin_attachment.py | 188 ++++++++++++++++++++
 3 files changed, 376 insertions(+), 18 deletions(-)
 create mode 100644 tests/test_smart_crusher_toin_attachment.py

diff --git a/RUST_DEV.md b/RUST_DEV.md
index 614ffacc..135b9a7f 100644
--- a/RUST_DEV.md
+++ b/RUST_DEV.md
@@ -159,6 +159,37 @@ The recorder monkey-patches the in-process transform classes (see
 `record_all()` in `tests/parity/recorder.py`). It does **not** modify any
 file under `headroom/`.
 
+## Known regressions in retired-Python components
+
+The Stage 3b/3c.1b retirements deleted Python source for `DiffCompressor`
+and `SmartCrusher` and replaced them with PyO3-delegating shims. The
+2026-04-28 audit found that the retirements shipped with subsystems
+silently disconnected. This section tracks each gap and its disposition
+so they don't regress further or get forgotten.
+
+### SmartCrusher
+
+| Subsystem | State | Tracked by |
+|---|---|---|
+| TOIN learning loop | **Re-attached 2026-04-28.** Shim's `crush()` and `_smart_crush_content()` now call `toin.record_compression()` after a real compression. Filtered on `strategy != "passthrough"` to ignore JSON re-canonicalization. Best-effort: TOIN failures are logged at debug level and don't break compression. | `tests/test_smart_crusher_toin_attachment.py` |
+| CCR marker emission knob | **Open gap.** `ccr_config.inject_retrieval_marker=False` is not honored — the Rust port emits `<<ccr:HASH N_rows_offloaded>>` markers as part of `dropped_summary` unconditionally. Shim now logs a WARNING when callers pass `False`. **Fix needed:** add a `enable_ccr_marker: bool` gate in `crates/headroom-core/src/transforms/smart_crusher/crusher.rs::crush_array`, plumb through `SmartCrusherConfig` and the PyO3 bridge. | This file + warning at `headroom/transforms/smart_crusher.py` |
+| Custom relevance scorer | **Open gap.** `relevance_config` and `scorer` constructor args are accepted for source compatibility but the Rust default `HybridScorer` always runs. Shim now logs WARNING (previously debug). **Fix needed:** expose a Python-bridged scorer constructor surface from `crates/headroom-core/src/relevance/`. | Warning at `headroom/transforms/smart_crusher.py` |
+| Per-tool TOIN learning hook | **Re-attached partially.** `_smart_crush_content` accepts `tool_name` and now threads it into the TOIN record. The hook is best-effort — it improves `query_context` aggregation but doesn't drive per-tool overrides yet. | `tests/test_smart_crusher_toin_attachment.py::test_smart_crush_content_records_to_toin` |
+
+### DiffCompressor
+
+| Subsystem | State |
+|---|---|
+| Adaptive context windows | Honored byte-for-byte (parity fixture-locked). |
+| TOIN integration | Never had one — DiffCompressor records via `_record_to_toin` in ContentRouter, which already runs for non-SmartCrusher strategies. No regression. |
+
+### Watch list (potential regressions, not yet audited)
+
+- `CCRConfig.enabled=False` end-to-end behavior. Currently the Rust port has a CCR store that's controlled by builder selection, not by a config flag. Sometimes-disabled paths haven't been audited.
+- `SmartCrusherConfig.use_feedback_hints=False` — config field is forwarded to Rust but its honoring inside the Rust crusher hasn't been verified against a parity fixture for the disabled path.
+
+When any item above changes, update both this section and the test file. The shim's docstring also references this section — keep them aligned.
+
 ## Phase 0 Blockers
 
 These are known limitations for Phase 0. They are tracked here so Phase 1
diff --git a/headroom/transforms/smart_crusher.py b/headroom/transforms/smart_crusher.py
index 1ae5ac5f..66dddaac 100644
--- a/headroom/transforms/smart_crusher.py
+++ b/headroom/transforms/smart_crusher.py
@@ -18,18 +18,28 @@
 fallback. Build it locally with `scripts/build_rust_extension.sh`
 (wraps `maturin develop`) or install a prebuilt wheel.
 
-Stage 3c.1 deliberately keeps the optional subsystems (TOIN,
-feedback, CCR marker injection, telemetry) disabled in the Rust port.
-The shim accepts `relevance_config`, `scorer`, and `ccr_config`
-constructor args for source compatibility but does not wire them
-through — they re-attach in Stage 3c.2 when those subsystems land in
-Rust. CCR marker injection in `_smart_crush_content` is a Stage 3c.2
-follow-up; today the Rust port never emits CCR markers, so the
-disabled-path behavior is byte-equal.
+# Functionality state (post-audit, 2026-04-28)
+
+- **TOIN learning** — re-attached. `crush()` and `_smart_crush_content`
+  call `toin.record_compression()` after a real compression (filtered on
+  `strategy != "passthrough"` to ignore JSON re-canonicalization).
+  The retired Python class did this inline; the bridge keeps the
+  highest-traffic strategy fueling the learning loop.
+- **CCR marker emission** — partial gap. The Rust port emits
+  `<<ccr:HASH N_rows_offloaded>>` markers unconditionally as part of
+  `dropped_summary`; the `ccr_config.inject_retrieval_marker=False`
+  knob is therefore not honored end-to-end. The shim now logs a
+  WARNING when callers pass `False`. Suppression needs a Rust-side
+  gate; see RUST_DEV.md.
+- **Custom relevance scorer / scorer override** — still not wired.
+  `relevance_config` and `scorer` constructor args are accepted for
+  source compat but the Rust default `HybridScorer` is always used.
+  The shim now logs a WARNING (was: debug) when these are passed.
 """
 
 from __future__ import annotations
 
+import json
 import logging
 from dataclasses import dataclass
 from typing import Any
@@ -159,14 +169,26 @@ def __init__(
         self._with_compaction = with_compaction
 
         # CCR config is preserved on `self` for callers that read it
-        # back (`headroom.proxy.server` does), but the Rust port doesn't
-        # exercise it: Stage 3c.1 keeps CCR marker injection disabled
-        # because the Rust port has no compression store. When Stage
-        # 3c.2 lands the CCR port, this wires through to Rust.
+        # back (`headroom.proxy.server` does). Storage-side semantics
+        # (CCR cache lookups) are honored via the Rust crusher's own
+        # CCR store. *Marker emission* is the gap: the Rust port emits
+        # `<<ccr:HASH N_rows_offloaded>>` markers unconditionally as
+        # part of `dropped_summary`, so `inject_retrieval_marker=False`
+        # has no effect on the prompt today. We log a WARNING (not
+        # debug) so it's visible. Suppression of marker emission needs
+        # a Rust-side gate; tracked in RUST_DEV.md.
         if ccr_config is None:
             self._ccr_config = CCRConfig(enabled=True, inject_retrieval_marker=False)
         else:
             self._ccr_config = ccr_config
+        if not self._ccr_config.inject_retrieval_marker:
+            logger.warning(
+                "SmartCrusher: ccr_config.inject_retrieval_marker=False "
+                "is currently NOT honored by the Rust port — CCR row-drop "
+                "markers (`<<ccr:HASH N_rows_offloaded>>`) will still appear "
+                "in compressed output. Tracked as a known regression in "
+                "RUST_DEV.md until a Rust-side gate lands."
+            )
 
         # `relevance_config` and `scorer` are accepted for source
         # compatibility but currently dropped — Stage 3c.1 ships with
@@ -174,12 +196,18 @@ def __init__(
         # Stage 3c.2 when the relevance crate gains a Python-bridged
         # constructor surface.
         if relevance_config is not None or scorer is not None:
-            logger.debug(
-                "SmartCrusher: relevance_config/scorer args are ignored in "
-                "Stage 3c.1 (Rust port uses default HybridScorer). They "
-                "will be wired through in Stage 3c.2."
+            logger.warning(
+                "SmartCrusher: custom relevance_config/scorer args are "
+                "currently ignored (Rust port uses default HybridScorer). "
+                "Tracked as a known regression in RUST_DEV.md."
             )
 
+        # Lazy TOIN handle. Loaded on first compression that has items
+        # to learn from. Skipping import at __init__ keeps cold-start
+        # fast for environments where telemetry is disabled.
+        self._toin: Any = None
+        self._toin_load_failed = False
+
         # Build the Rust crusher with every field from the Python
         # config, plus the relevance_threshold default (0.3) — the
         # Python dataclass doesn't carry that field; it lives on
@@ -221,6 +249,26 @@ def crush(self, content: str, query: str = "", bias: float = 1.0) -> CrushResult
         working.
         """
         r = self._rust.crush(content, query, bias)
+        # Re-attach the TOIN learning loop. The retired Python class
+        # recorded compressions into TOIN inline; the Rust port doesn't
+        # know about TOIN, and `ContentRouter._record_to_toin` skips
+        # SmartCrusher on the assumption SmartCrusher records its own.
+        # Bridging the gap here keeps JSON-array compressions fueling
+        # the learning system.
+        #
+        # Filter on `was_modified AND strategy != "passthrough"`. The
+        # Rust crusher sometimes flips `was_modified=True` from pure
+        # JSON re-canonicalization (whitespace normalization) without
+        # actually compressing — the strategy stays `"passthrough"` in
+        # that case, and there's no learning value in recording it.
+        if r.was_modified and r.strategy != "passthrough":
+            self._record_to_toin(
+                original=content,
+                compressed=r.compressed,
+                strategy=r.strategy,
+                query_context=query,
+                tool_name=None,
+            )
         return CrushResult(
             compressed=r.compressed,
             original=r.original,
@@ -290,12 +338,103 @@ def _smart_crush_content(
         """Apply smart crushing; return `(crushed, was_modified, info)`.
 
         Mirrors the retired Python method's tuple shape. `tool_name` is
-        accepted for API compatibility and currently ignored — Stage
-        3c.1 has no per-tool TOIN learning hook.
+        threaded through to TOIN's per-tool learning records; if no
+        tool name is available (e.g. the legacy pipeline doesn't have
+        one in scope) the recording uses content-based signature only.
         """
         crushed, was_modified, info = self._rust.smart_crush_content(content, query_context, bias)
+        # Same passthrough filter as `crush()` — re-canonicalization of
+        # JSON whitespace can flip `was_modified=True` even when the
+        # `info` field reports `passthrough` and no compression happened.
+        if was_modified and info != "passthrough":
+            self._record_to_toin(
+                original=content,
+                compressed=crushed,
+                strategy=info or "smart_crusher",
+                query_context=query_context,
+                tool_name=tool_name,
+            )
         return crushed, was_modified, info
 
+    def _record_to_toin(
+        self,
+        original: str,
+        compressed: str,
+        strategy: str,
+        query_context: str,
+        tool_name: str | None,
+    ) -> None:
+        """Record a successful compression into TOIN's learning store.
+
+        Replaces the inline TOIN call the retired Python SmartCrusher
+        had at the end of its compression path. Best-effort: TOIN
+        failures are logged at debug level and never bubble — the
+        compression itself has already happened and is correct.
+
+        Token estimates use the `len(json) // 4` rule the retired
+        implementation used. The router doesn't pass a tokenizer down
+        this far, and re-tokenizing here would dominate the recording
+        cost. Rough estimates are fine for learning aggregates.
+        """
+        if self._toin_load_failed:
+            return
+        try:
+            try:
+                items = json.loads(original)
+            except (json.JSONDecodeError, ValueError):
+                # Not JSON — nothing structural for TOIN to learn from
+                # at the array level. The Rust crusher only sets
+                # `was_modified=True` on JSON-array inputs, so this
+                # branch is rare; bail quietly.
+                return
+            if not isinstance(items, list):
+                return
+
+            from ..telemetry.models import ToolSignature
+
+            signature = ToolSignature.from_items(items)
+            original_tokens = max(1, len(original) // 4)
+            compressed_tokens = max(1, len(compressed) // 4)
+
+            if self._toin is None:
+                from ..telemetry.toin import get_toin
+
+                self._toin = get_toin()
+
+            # Extract the kept-row count from the compressed payload
+            # when possible. The lossy path emits a JSON array with a
+            # `_ccr_dropped` sentinel suffix; the lossless path emits
+            # CSV-schema or compact JSON. For the array case we get an
+            # exact compressed_count; otherwise fall back to the rough
+            # `original_count` (TOIN cares more about structural
+            # signature than count precision).
+            try:
+                compressed_parsed = json.loads(compressed)
+                compressed_count = (
+                    len(strip_ccr_sentinels(compressed_parsed))
+                    if isinstance(compressed_parsed, list)
+                    else len(items)
+                )
+            except (json.JSONDecodeError, ValueError):
+                compressed_count = len(items)
+
+            self._toin.record_compression(
+                tool_signature=signature,
+                original_count=len(items),
+                compressed_count=compressed_count,
+                original_tokens=original_tokens,
+                compressed_tokens=compressed_tokens,
+                strategy=strategy,
+                query_context=query_context if query_context else None,
+                items=items[:5],  # Sample for field-level learning
+            )
+        except ImportError:
+            # TOIN module not installed in this build — disable for
+            # the lifetime of this crusher to avoid retry overhead.
+            self._toin_load_failed = True
+        except Exception as e:  # pragma: no cover - best effort
+            logger.debug("SmartCrusher TOIN recording failed (non-fatal): %s", e)
+
     def _extract_context_from_messages(self, messages: list[dict[str, Any]]) -> str:
         """Build a query string from the last 5 user messages + recent
         assistant tool-call arguments. Used by `apply()` to derive the
diff --git a/tests/test_smart_crusher_toin_attachment.py b/tests/test_smart_crusher_toin_attachment.py
new file mode 100644
index 00000000..a7ad2164
--- /dev/null
+++ b/tests/test_smart_crusher_toin_attachment.py
@@ -0,0 +1,188 @@
+"""TOIN re-attachment regression tests for the Rust-backed SmartCrusher.
+
+When SmartCrusher's Python implementation was retired in Stage 3c.1b,
+its inline `toin.record_compression()` call was lost. The Rust port
+doesn't know about TOIN, and `ContentRouter._record_to_toin` skips
+SmartCrusher on the assumption SmartCrusher records its own. The net
+result was a silent regression: JSON-array compressions — the
+highest-traffic strategy — stopped feeding the learning loop.
+
+The shim now bridges the gap. These tests assert that:
+
+1. Calling `SmartCrusher.crush(...)` on a JSON array of dicts produces a
+   `record_compression` event in TOIN when the input is large enough to
+   actually compress.
+2. Pass-through inputs (no modification) do NOT record — TOIN only
+   learns from real compression events.
+3. The recorded `tool_signature.structure_hash` is computed over the
+   parsed items, so two compressions of structurally-similar inputs
+   land on the same pattern.
+4. Calling `_smart_crush_content(...)` (the legacy `apply()` path) also
+   records.
+5. TOIN failures are non-fatal — compression completes even if the
+   telemetry import or call raises.
+"""
+
+from __future__ import annotations
+
+import tempfile
+from pathlib import Path
+from typing import Any
+from unittest.mock import patch
+
+import pytest
+
+from headroom.telemetry.toin import TOINConfig, get_toin, reset_toin
+from headroom.transforms.smart_crusher import SmartCrusher, SmartCrusherConfig
+
+
+@pytest.fixture
+def fresh_toin():
+    reset_toin()
+    with tempfile.TemporaryDirectory() as tmpdir:
+        storage = str(Path(tmpdir) / "toin.json")
+        toin = get_toin(
+            TOINConfig(
+                storage_path=storage,
+                auto_save_interval=0,
+            )
+        )
+        yield toin
+        reset_toin()
+
+
+def _bigger_array(n: int = 60) -> str:
+    """JSON array of `n` dicts, large enough to trigger crushing.
+
+    Items use a low-uniqueness shape (`{"status": "ok", "tag": "x"}`)
+    so the analyzer recommends compaction or row drops. We need at
+    least 200 tokens (`min_tokens_to_crush` default) to enter the
+    crusher; 60 items keeps us well above that.
+    """
+    items = [{"status": "ok", "tag": "x", "n": i} for i in range(n)]
+    import json as _json
+
+    return _json.dumps(items)
+
+
+# ─── crush() path (router-style) ───────────────────────────────────────
+
+
+def test_crush_records_to_toin_on_modification(fresh_toin):
+    crusher = SmartCrusher(SmartCrusherConfig())
+    payload = _bigger_array(60)
+    pre = sum(p.total_compressions for p in fresh_toin._patterns.values())
+
+    result = crusher.crush(payload, query="test query", bias=1.0)
+
+    if not result.was_modified:
+        pytest.skip("payload didn't trigger compression — bump the size")
+    post = sum(p.total_compressions for p in fresh_toin._patterns.values())
+    assert post > pre, "TOIN should have recorded a compression event"
+
+
+def test_crush_does_not_record_on_passthrough(fresh_toin):
+    """A small input the analyzer doesn't compress should produce no
+    TOIN recording, even if the Rust port flipped `was_modified=True`
+    from JSON whitespace re-canonicalization. The strategy stays
+    `passthrough` in that case and we filter on it."""
+    crusher = SmartCrusher(SmartCrusherConfig())
+    payload = '[{"id": 1}]'  # Single item — below min_items_to_analyze.
+    pre = sum(p.total_compressions for p in fresh_toin._patterns.values())
+
+    result = crusher.crush(payload, query="", bias=1.0)
+    assert result.strategy == "passthrough"
+    post = sum(p.total_compressions for p in fresh_toin._patterns.values())
+    assert post == pre, "no recording when strategy is passthrough"
+
+
+def test_crush_signature_groups_similar_inputs(fresh_toin):
+    """Two structurally-similar payloads should record under the same
+    tool signature so TOIN can aggregate the pattern across calls."""
+    crusher = SmartCrusher(SmartCrusherConfig())
+
+    payload_a = _bigger_array(60)
+    payload_b = _bigger_array(80)
+
+    crusher.crush(payload_a, query="", bias=1.0)
+    crusher.crush(payload_b, query="", bias=1.0)
+
+    if not fresh_toin._patterns:
+        pytest.skip("neither payload compressed — bump the size")
+    # Both share field shape {status, tag, n} → same structure hash →
+    # one pattern with at least 2 recordings.
+    pattern_counts = {h: p.total_compressions for h, p in fresh_toin._patterns.items()}
+    assert max(pattern_counts.values()) >= 2, (
+        f"expected the same pattern to be recorded twice, got {pattern_counts}"
+    )
+
+
+# ─── _smart_crush_content() path (legacy apply()) ──────────────────────
+
+
+def test_smart_crush_content_records_to_toin(fresh_toin):
+    crusher = SmartCrusher(SmartCrusherConfig())
+    payload = _bigger_array(60)
+    pre = sum(p.total_compressions for p in fresh_toin._patterns.values())
+
+    crushed, was_modified, _info = crusher._smart_crush_content(
+        payload, query_context="user query", tool_name="get_records", bias=1.0
+    )
+
+    if not was_modified:
+        pytest.skip("payload didn't trigger compression")
+    post = sum(p.total_compressions for p in fresh_toin._patterns.values())
+    assert post > pre
+
+
+# ─── Failure modes ─────────────────────────────────────────────────────
+
+
+def test_toin_failure_does_not_break_compression(fresh_toin):
+    """If TOIN's `record_compression` raises, compression still
+    completes and returns a valid CrushResult. Telemetry is best-
+    effort; the request must not fail."""
+    crusher = SmartCrusher(SmartCrusherConfig())
+    payload = _bigger_array(60)
+
+    def _boom(*_a: Any, **_kw: Any) -> None:
+        raise RuntimeError("simulated TOIN backend down")
+
+    with patch.object(fresh_toin, "record_compression", side_effect=_boom):
+        result = crusher.crush(payload, query="", bias=1.0)
+        assert isinstance(result.compressed, str)
+        # Crusher itself should still report a result regardless of
+        # whether the underlying analysis chose to modify.
+
+
+def test_non_json_input_does_not_record(fresh_toin):
+    """Non-JSON input shouldn't blow up — it just doesn't record.
+    The Rust crusher returns `was_modified=False` for non-arrays;
+    even if it didn't, the helper guards against `json.loads`
+    failure."""
+    crusher = SmartCrusher(SmartCrusherConfig())
+    pre = sum(p.total_compressions for p in fresh_toin._patterns.values())
+
+    result = crusher.crush("not json at all", query="", bias=1.0)
+    assert result.was_modified is False  # passthrough
+    post = sum(p.total_compressions for p in fresh_toin._patterns.values())
+    assert post == pre
+
+
+# ─── CCR marker knob ───────────────────────────────────────────────────
+
+
+def test_ccr_inject_marker_false_logs_warning(caplog):
+    """Until the Rust port honors the marker-suppression flag, we want a
+    visible warning when callers ask for it. This guards against the
+    silent regression that started this audit."""
+    from headroom.config import CCRConfig
+
+    with caplog.at_level("WARNING", logger="headroom.transforms.smart_crusher"):
+        SmartCrusher(
+            SmartCrusherConfig(),
+            ccr_config=CCRConfig(enabled=True, inject_retrieval_marker=False),
+        )
+    assert any("inject_retrieval_marker=False" in rec.getMessage() for rec in caplog.records), (
+        "expected a WARNING about the unsupported marker-suppression flag"
+    )

From cf04c8ac78835691a92b94a10abd2a7215c711e6 Mon Sep 17 00:00:00 2001
From: chopratejas <chopratejas@gmail.com>
Date: Tue, 28 Apr 2026 13:49:16 -0700
Subject: [PATCH 2/2] test(smart_crusher): make CCR-warning test resilient to
 logging pollution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`caplog` flakes under the full suite — earlier tests can leave
`logging.disable()` or root-handler state that suppresses records on
the named logger before pytest's caplog filter sees them. The test
passed in isolation but failed deterministically once the larger
suite preceded it (CI run 25070639932 across all four Python
versions, ~4500 tests in).

Fix: monkeypatch `headroom.transforms.smart_crusher.logger.warning`
directly. Bypasses every level/disable/handler concern; we only
assert the constructor *called* `logger.warning` with the expected
flag name in the message — which is exactly the contract this test
guards.
---
 tests/test_smart_crusher_toin_attachment.py | 35 ++++++++++++++++-----
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/tests/test_smart_crusher_toin_attachment.py b/tests/test_smart_crusher_toin_attachment.py
index a7ad2164..25e79726 100644
--- a/tests/test_smart_crusher_toin_attachment.py
+++ b/tests/test_smart_crusher_toin_attachment.py
@@ -172,17 +172,36 @@ def test_non_json_input_does_not_record(fresh_toin):
 # ─── CCR marker knob ───────────────────────────────────────────────────
 
 
-def test_ccr_inject_marker_false_logs_warning(caplog):
+def test_ccr_inject_marker_false_logs_warning(monkeypatch):
     """Until the Rust port honors the marker-suppression flag, we want a
     visible warning when callers ask for it. This guards against the
-    silent regression that started this audit."""
+    silent regression that started this audit.
+
+    Why monkeypatch instead of `caplog`: `caplog` flakes under the full
+    suite — earlier tests can leave `logging.disable()` or root-handler
+    state that suppresses the named logger's records before the
+    test's filter sees them. Patching `logger.warning` on the module
+    bypasses every level/disable/handler concern; we only assert that
+    the constructor *called* `logger.warning` with the flag name in
+    the message.
+    """
     from headroom.config import CCRConfig
+    from headroom.transforms import smart_crusher as sc_module
 
-    with caplog.at_level("WARNING", logger="headroom.transforms.smart_crusher"):
-        SmartCrusher(
-            SmartCrusherConfig(),
-            ccr_config=CCRConfig(enabled=True, inject_retrieval_marker=False),
-        )
-    assert any("inject_retrieval_marker=False" in rec.getMessage() for rec in caplog.records), (
+    captured: list[str] = []
+    real_warning = sc_module.logger.warning
+
+    def _capture(msg: str, *args: object, **kwargs: object) -> None:
+        captured.append(msg % args if args else msg)
+        real_warning(msg, *args, **kwargs)
+
+    monkeypatch.setattr(sc_module.logger, "warning", _capture)
+
+    SmartCrusher(
+        SmartCrusherConfig(),
+        ccr_config=CCRConfig(enabled=True, inject_retrieval_marker=False),
+    )
+
+    assert any("inject_retrieval_marker=False" in m for m in captured), (
         "expected a WARNING about the unsupported marker-suppression flag"
     )