feat(context): aggressive token-engine — adaptive intensity, redundancy cache, summarize fallback #10
Merged
claudioemmanuel merged 2 commits into develop on Apr 7, 2026
Conversation
feat(context): aggressive token-engine — adaptive intensity, redundancy cache, summarize fallback

PR1 of two. Adds the new context optimization layer (src/context/) that sits between filter::compress and wrap.rs, attacking three sources of token waste that previously went unaddressed:

- Adaptive intensity: the bash header now reports the active level (Lite/Full/Ultra), and per-handler limits scale automatically as cumulative session usage approaches the compact_threshold budget. Floors are enforced so we never reduce to zero.
- Cross-call redundancy cache: identical compressed output within the last 8 calls is collapsed to a single reference line. Length equality guards against FNV-1a collisions; tiny outputs (<5 lines) are skipped.
- Summarize fallback: raw outputs over 500 lines (configurable) are replaced with a dense ≤40-line summary (top errors, top files, test summary, last 20 lines verbatim) instead of running through the per-handler truncation pipeline.
- SessionContext persists at sessions/context.json next to current.json with bounded ring buffers (32 calls, 256 files, 128 errors, 64 git refs). Hand-rolled flat-array JSON via the existing json_util — no serde, zero new dependencies.
- Cross-call hint: cat/head/tail/less/more/bat of a file already in context emits "# squeez hint: <path> already in context (Read tool, call #N)" without blocking execution.

All new behavior is opt-out via four new config keys (adaptive_intensity, context_cache_enabled, redundancy_cache_enabled, summarize_threshold_lines). Existing handlers and strategies are untouched; the wrap.rs diff is ~30 lines.

Tests: 167 passing (was 132). New integration tests cover intensity boundaries, cache round-trip, redundancy hit/miss, and the summarize fallback. Inline unit tests in each context module add ~25 more cases.

Benches: bench/run.sh stays at 12/12; new bench/run_context.sh exercises the wrap.rs pre/post pass end-to-end (3/3 passing). The CI workflow runs both.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
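The redundancy-cache behavior described above (FNV-1a 64 hash, length-equality collision guard, last-8-calls window, tiny outputs skipped) can be sketched as follows. This is a minimal illustration, not the squeez source: the names `RecentEntry`, `RECENT_WINDOW`, `MIN_LINES`, and `find_redundant` are hypothetical.

```rust
const RECENT_WINDOW: usize = 8; // only the last 8 calls are consulted
const MIN_LINES: usize = 5; // tiny outputs are not worth caching

/// FNV-1a 64-bit hash (offset basis 0xcbf29ce484222325, prime 0x100000001b3).
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

struct RecentEntry {
    call_no: u32,
    hash: u64,
    len: usize, // collision guard: equal hash AND equal byte length required
}

/// Returns the call number of an identical recent output, if any.
fn find_redundant(recent: &[RecentEntry], output: &str) -> Option<u32> {
    if output.lines().count() < MIN_LINES {
        return None; // skip tiny outputs
    }
    let (h, l) = (fnv1a_64(output.as_bytes()), output.len());
    recent
        .iter()
        .rev()
        .take(RECENT_WINDOW)
        .find(|e| e.hash == h && e.len == l)
        .map(|e| e.call_no)
}

fn main() {
    let out = "line1\nline2\nline3\nline4\nline5\n";
    let recent = vec![RecentEntry {
        call_no: 3,
        hash: fnv1a_64(out.as_bytes()),
        len: out.len(),
    }];
    assert_eq!(find_redundant(&recent, out), Some(3)); // hit: collapse to a reference line
    assert_eq!(find_redundant(&recent, "x\n"), None); // miss: too small to cache
    println!("ok");
}
```

The length check costs nothing and makes a false positive require both a 64-bit hash collision and an exact length match, which is why a non-cryptographic hash like FNV-1a is acceptable here.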
User requested maximum aggression by default. The Lite/Full tiers remain in the enum (forward-compat) but derive() now returns Ultra unconditionally when adaptive_intensity is enabled. To opt out of scaling entirely, set adaptive_intensity = false (falls back to Lite). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
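The behavior described in this commit can be sketched as below: the tiers stay in the enum for forward compatibility, but derivation short-circuits to Ultra whenever adaptive_intensity is enabled. The names `Intensity` and `derive_intensity` are illustrative, not necessarily the actual squeez identifiers.

```rust
#[allow(dead_code)] // Full is reserved for forward compatibility
#[derive(Debug, PartialEq, Clone, Copy)]
enum Intensity {
    Lite,  // no scaling (fallback when adaptive_intensity = false)
    Full,  // kept in the enum but currently never derived
    Ultra, // maximum aggression (current unconditional default)
}

fn derive_intensity(adaptive_enabled: bool) -> Intensity {
    if adaptive_enabled {
        // "Maximum aggression by default": budget-based tier selection
        // is intentionally bypassed for now.
        Intensity::Ultra
    } else {
        Intensity::Lite
    }
}

fn main() {
    assert_eq!(derive_intensity(true), Intensity::Ultra);
    assert_eq!(derive_intensity(false), Intensity::Lite);
    println!("ok");
}
```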
claudioemmanuel added a commit that referenced this pull request on Apr 7, 2026: feat(context): aggressive token-engine — adaptive intensity, redundancy cache, summarize fallback
This was referenced Apr 7, 2026
claudioemmanuel added a commit that referenced this pull request on Apr 7, 2026: feat(context): aggressive token-engine — adaptive intensity, redundancy cache, summarize fallback
Summary
PR1 of two implementing the squeez token-optimizer roadmap. Adds the context optimization layer (`src/context/`), a minimally-invasive engine that sits between `filter::compress` and `wrap.rs` and attacks three sources of token waste squeez previously ignored.

- Adaptive intensity: the bash header reports the active level (`Lite`/`Full`/`Ultra`), and per-handler limits (`max_lines`, `dedup_min`, `git_diff_max`, `docker_logs_max`, `find_max`, `summarize_threshold`) automatically scale as cumulative session usage approaches the `compact_threshold` budget. Floors are enforced so we never reduce to zero.
- Cross-call redundancy cache: identical compressed output within the recent window is collapsed to a single reference line: `[squeez: identical to <hash> at bash#<n> — re-run with --no-squeez]`. Length equality guards against FNV-1a collisions; tiny outputs (<5 lines) are skipped.
- Summarize fallback: raw outputs over the configurable line threshold are replaced with a dense summary instead of running through the per-handler truncation pipeline.
- `SessionContext` persists at `sessions/context.json` next to `current.json` with bounded ring buffers (32 calls, 256 files, 128 errors, 64 git refs). Hand-rolled flat-array JSON — no serde, zero new dependencies (still just `libc` on Unix).
- Cross-call hint: `cat`/`head`/`tail`/`less`/`more`/`bat` of a file already in context emits a one-line hint without blocking execution.

All new behavior is opt-out via four new config keys (`adaptive_intensity`, `context_cache_enabled`, `redundancy_cache_enabled`, `summarize_threshold_lines`). Existing handlers and strategies are untouched; the `wrap.rs` diff is ~30 lines.

Changes
- `src/context/` module: `intensity.rs`, `cache.rs`, `redundancy.rs`, `summarize.rs`, `hash.rs` (FNV-1a 64), `mod.rs`
- `src/commands/wrap.rs`: pre/post-pass insertion + `[adaptive: <Level>]` header tag
- `src/config.rs`: 4 new fields with defaults + INI parser arms
- `src/json_util.rs`: `extract_u64_array`, `u64_array`, `usize_array` helpers
- `bench/fixtures/`: `summarize_huge.txt`, `intensity_budget80.txt`, `context_crosscall_{1,2,3}.txt`
- `bench/run_context.sh`: end-to-end wrap-mode bench (3 scenarios)
- `bench/run.sh`: skips `context_crosscall_*` (handled by `run_context.sh`)
- `.github/workflows/ci.yml`: runs `bench/run_context.sh` after `bench/run.sh`
- `README.md`: documents the four new config keys + the intensity model

Test plan
- `cargo test`: 167 passing (was 132)
- `cargo build --release`: clean
- `bench/run.sh`: 12/12 fixtures pass (incl. new `summarize_huge` at 100% reduction and `intensity_budget80` at 99% reduction)
- `bench/run_context.sh`: 3/3 scenarios pass:
  - `summarize_huge` triggers the summary header; output ≤60 lines
  - `intensity_budget80` with a seeded `current.json` shows `[adaptive: Ultra]`
  - `context_crosscall_{2,3}` emit redundancy reference lines after `_1`

Risks & mitigations
- Redundancy cache window is bounded (`RECENT_WINDOW=8`); opt-out via `redundancy_cache_enabled=false`
- The `[adaptive: <Level>]` header tag keeps the active intensity level visible
- Only existing dependency is `libc` (Unix-only) in `Cargo.toml`

PR2 (memory compressor + caveman persona + `squeez update` + `track-result` hook) will follow on top of this branch.

🤖 Generated with Claude Code
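The bounded ring buffers that back `SessionContext` (32 calls, 256 files, 128 errors, 64 git refs) can be sketched as a capped push-evict queue. A minimal sketch assuming a `VecDeque`-based design; the actual squeez implementation may differ, and the `Ring` type here is hypothetical.

```rust
use std::collections::VecDeque;

/// Fixed-capacity ring: push evicts the oldest entry once the cap is hit.
struct Ring<T> {
    cap: usize,
    buf: VecDeque<T>,
}

impl<T> Ring<T> {
    fn new(cap: usize) -> Self {
        Ring { cap, buf: VecDeque::with_capacity(cap) }
    }

    fn push(&mut self, item: T) {
        if self.buf.len() == self.cap {
            self.buf.pop_front(); // evict oldest to stay bounded
        }
        self.buf.push_back(item);
    }

    fn len(&self) -> usize {
        self.buf.len()
    }
}

fn main() {
    // Mirror the 32-call buffer: 100 pushes never exceed the cap.
    let mut calls: Ring<u32> = Ring::new(32);
    for i in 0..100 {
        calls.push(i);
    }
    assert_eq!(calls.len(), 32); // bounded at cap
    assert_eq!(*calls.buf.front().unwrap(), 68); // oldest surviving call
    println!("ok");
}
```

Bounding every buffer at construction is what keeps `sessions/context.json` from growing without limit across a long session.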