Promote develop to main — PR1+PR2 token engine#13
Closed
claudioemmanuel wants to merge 28 commits into main
Merge README updates into develop
…tted; add PR_CREATION_TOKEN guidance
Merge promote workflow fix into develop
…gates, libc unix-only
feat: add Windows (Git Bash) support
…cy cache, summarize fallback

PR1 of two. Adds the new context optimization layer (src/context/) that sits between filter::compress and wrap.rs, attacking three sources of token waste that previously went unaddressed:

- Adaptive intensity: bash header now reports the active level (Lite/Full/Ultra) and per-handler limits scale automatically as cumulative session usage approaches the compact_threshold budget. Floors are enforced so limits never reduce to zero.
- Cross-call redundancy cache: the same compressed output within the last 8 calls is collapsed to a single reference line. Length-equality guarded against FNV-1a collisions; tiny outputs (<5 lines) skipped.
- Summarize fallback: raw outputs over 500 lines (configurable) are replaced with a dense ≤40-line summary (top errors, top files, test summary, last 20 lines verbatim) instead of running through the per-handler truncation pipeline.
- SessionContext persists at sessions/context.json next to current.json with bounded ring buffers (32 calls, 256 files, 128 errors, 64 git refs). Hand-rolled flat-array JSON via the existing json_util: no serde, zero new dependencies.
- Cross-call hint: cat/head/tail/less/more/bat of a file already in context emits "# squeez hint: <path> already in context (Read tool, call #N)" without blocking execution.

All new behavior is opt-out via four new config keys (adaptive_intensity, context_cache_enabled, redundancy_cache_enabled, summarize_threshold_lines). Existing handlers and strategies are untouched; the wrap.rs diff is ~30 lines.

Tests: 167 passing (was 132). New integration tests cover intensity boundaries, cache round-trip, redundancy hit/miss, summarize fallback. Inline unit tests in each context module add ~25 more cases.

Benches: bench/run.sh stays at 12/12; new bench/run_context.sh exercises the wrap.rs pre/post pass end-to-end (3/3 passing). CI workflow runs both.
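The redundancy cache described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/context/: the `RedundancyCache` and `Entry` types, the `check` method, and the wording of the reference line are all assumptions, and the window here counts recorded outputs rather than raw calls. Only the 8-entry window, the 5-line minimum, the FNV-1a hash, and the length-equality guard come from the commit message.

```rust
use std::collections::VecDeque;

const RECENT_WINDOW: usize = 8; // window size stated in the commit message
const MIN_LINES: usize = 5; // outputs shorter than this are never collapsed

/// 64-bit FNV-1a over raw bytes (standard offset basis and prime).
fn fnv1a(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in data {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// One remembered output: which call produced it, its hash, and its byte length.
struct Entry {
    call: u32,
    hash: u64,
    len: usize,
}

/// Hypothetical cache type; the real SessionContext layout is not shown in the PR.
struct RedundancyCache {
    recent: VecDeque<Entry>,
    next_call: u32,
}

impl RedundancyCache {
    fn new() -> Self {
        Self { recent: VecDeque::with_capacity(RECENT_WINDOW), next_call: 1 }
    }

    /// Returns a one-line reference on a hit, or None on a miss
    /// (in which case the output is remembered for later calls).
    fn check(&mut self, output: &str) -> Option<String> {
        let call = self.next_call;
        self.next_call += 1;
        if output.lines().count() < MIN_LINES {
            return None; // tiny outputs are skipped entirely
        }
        let hash = fnv1a(output.as_bytes());
        let len = output.len();
        // A hit requires both hash and length to match, which guards
        // against accepting a bare FNV-1a collision.
        let hit = self
            .recent
            .iter()
            .find(|e| e.hash == hash && e.len == len)
            .map(|e| e.call);
        if hit.is_none() {
            if self.recent.len() == RECENT_WINDOW {
                self.recent.pop_front(); // evict the oldest entry
            }
            self.recent.push_back(Entry { call, hash, len });
        }
        // The reference-line format below is illustrative only.
        hit.map(|n| format!("# squeez: same output as call #{n} ({len} bytes)"))
    }
}
```

In this sketch a caller would run `check` on each compressed output and, on `Some(line)`, emit that line instead of the full text.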
User requested maximum aggression by default. The Lite/Full tiers remain in the enum (forward-compat) but derive() now returns Ultra unconditionally when adaptive_intensity is enabled. To opt out of scaling entirely, set adaptive_intensity = false (falls back to Lite).
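As a rough sketch of the behavior this commit describes: `derive()` and the three tier names come from the message above, but the signature, the `scaled_limit` helper, and the percentage values are assumptions made purely for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Intensity {
    Lite,
    Full, // kept in the enum for forward compatibility
    Ultra,
}

/// Ultra whenever adaptive intensity is enabled, Lite otherwise.
/// (Assumed signature; the real derive() may take more context.)
fn derive(adaptive_intensity: bool) -> Intensity {
    if adaptive_intensity {
        Intensity::Ultra
    } else {
        Intensity::Lite
    }
}

/// Hypothetical helper: scale a per-handler line limit by level,
/// never dropping below `floor`. The percentages are made up.
fn scaled_limit(base: usize, level: Intensity, floor: usize) -> usize {
    let percent = match level {
        Intensity::Lite => 100,
        Intensity::Full => 60,
        Intensity::Ultra => 30,
    };
    (base * percent / 100).max(floor)
}
```

With these illustrative numbers, a 200-line limit becomes 60 lines at Ultra, while a 10-line limit with a floor of 5 stays at 5 rather than shrinking to 3.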
PR2 of two. Stacks on the context engine from PR1 and lands the user-facing features that make squeez attack input + output + memory-file tokens at maximum aggression by default.

Highlights:

- squeez compress-md: pure-Rust, zero-LLM caveman-style markdown compressor. Line-based state machine preserves fenced code blocks, inline code, URLs (bare + markdown link targets), headings, tables, list markers, and version strings; only natural-language prose is compressed. Drops articles + fillers + pleasantries + hedging everywhere. --ultra additionally substitutes with→w/, function→fn, configuration→config, repository→repo, etc. Damage heuristic aborts the write if any code block, URL, or heading was lost. Backups go to <stem>.original.md and are never overwritten.
- Caveman persona: three intensity levels (lite/full/ultra) shipped via include_str!. Default is Ultra. Injected into the squeez init banner (Claude Code) and into the <!-- squeez:start --> block in ~/.copilot/copilot-instructions.md (Copilot CLI).
- auto_compress_md=true: squeez init runs compress-md --all silently on every session start. Idempotent: backup is only written once and the integrity check guarantees safety.
- squeez update: self-updater. Shells out to curl + sha256sum/shasum (both already required by install.sh) so we stay zero-dep. Atomic on Unix via .new + rename; documented .new + manual move on Windows. Override base URL via SQUEEZ_UPDATE_URL_OVERRIDE for tests. --check reports without installing; --insecure skips checksum.
- squeez track-result + posttooluse hook update: PostToolUse now pipes the raw JSON for every tool call (Read, Grep, LS, Glob, ...) into squeez track-result, which extracts file paths + errors from the result content and feeds them into SessionContext. Future bash calls dedup against this state via the cross-call hint added in PR1.
- Config additions: persona (default Ultra), auto_compress_md (default true). Both opt-out via config.ini.

Tests: 230 passing (was 162). New integration tests for compress_md (13), persona (8), track_result (6), update (8). New mdcompress_* bench fixtures show ~22-27% reduction (caveman LLM gets ~45% with API calls; we get ~25% with zero API calls in <5ms).
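The line-based state machine and damage heuristic for compress-md might look something like the sketch below. It is deliberately much simpler than what the commit describes: the word list, function names, and counting rules are all hypothetical, lines containing inline code are passed through untouched rather than compressed around the code span, and leading indentation on prose lines is not preserved.

```rust
// Hypothetical filler list; the real compressor's vocabulary is not shown in the PR.
const DROP_WORDS: [&str; 6] = ["a", "an", "the", "just", "really", "please"];

/// True for a line that opens or closes a fenced code block.
fn is_fence(line: &str) -> bool {
    line.trim_start().starts_with(&"`".repeat(3))
}

/// Drop filler words from one prose line (case-insensitive, whole tokens only).
fn compress_prose(line: &str) -> String {
    line.split_whitespace()
        .filter(|w| !DROP_WORDS.contains(&w.to_lowercase().as_str()))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Two-state machine: inside a fence every line is copied verbatim;
/// outside, headings, table rows, and lines with inline code are kept as-is.
fn compress_md(input: &str) -> String {
    let mut in_fence = false;
    let mut out: Vec<String> = Vec::new();
    for line in input.lines() {
        let trimmed = line.trim_start();
        if is_fence(line) {
            in_fence = !in_fence;
            out.push(line.to_string());
        } else if in_fence
            || trimmed.starts_with('#')
            || trimmed.starts_with('|')
            || line.contains('`')
        {
            out.push(line.to_string());
        } else {
            out.push(compress_prose(line));
        }
    }
    out.join("\n")
}

/// Counts of (fence lines, heading-like lines, URL occurrences).
fn structure_counts(text: &str) -> (usize, usize, usize) {
    let fences = text.lines().filter(|l| is_fence(l)).count();
    let headings = text.lines().filter(|l| l.trim_start().starts_with('#')).count();
    let urls = text.matches("http://").count() + text.matches("https://").count();
    (fences, headings, urls)
}

/// Damage heuristic: any change in the structural counts means the write should abort.
fn is_damaged(original: &str, compressed: &str) -> bool {
    structure_counts(original) != structure_counts(compressed)
}
```

A caller following the commit's description would write the compressed text only when `is_damaged` returns false, after first saving the untouched backup.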
feat(context): aggressive token-engine — adaptive intensity, redundancy cache, summarize fallback
…update feat(pr2): compress-md, caveman persona, squeez update, track-result hook
- Updated "What it does" with context engine, persona, compress-md, track-result
- Added summarize_huge and intensity_budget80 rows to benchmark table (14/14 pass)
- Updated changelog with 2026-04-07 entries for all new features
- Updated PostToolUse description to reflect full artifact extraction
…ken budget

- jq/yq/terraform/tofu/helm/pulumi: DataToolHandler (terraform: strip unchanged resources, keep +/-/~ and Plan: lines; helm: strip boilerplate; jq/yq: drop empty/null lines + truncate)
- grep/rg/awk/sed: TextProcHandler (collapse files with ≥5 matches to summary line; dedup + truncate)
- podman, nextest, az: added to filter.rs routing table
- RECENT_WINDOW: 8→16 (catch repeated outputs in longer call sequences)
- MIN_LINES: 5→2 (short outputs like `git status` 2-liner now eligible for dedup)
- compact_threshold_tokens default: 160K→120K (50K safety margin for sonnet-4-6 200K context; avoids truncation from protocol overhead + system prompt)
- Per-tool token breakdown in compact warning: Bash/Read/Other separate counters persisted to context.json via SessionContext.note_tool_tokens()
- README: deduplicated benchmark header, updated config defaults, handler table, redundancy window and compact warning documentation
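For the TextProcHandler rule in the list above (collapse files with ≥5 matches to a summary line), a minimal sketch could look like this. It assumes grep-style `path:line:text` output where the path contains no colon; the `collapse_matches` name and the summary wording are hypothetical, and the dedup and truncate steps are left out.

```rust
use std::collections::{HashMap, HashSet};

const COLLAPSE_AT: usize = 5; // threshold from the commit message

/// Everything before the first ':' is treated as the file path.
fn file_of(line: &str) -> &str {
    line.split(':').next().unwrap_or(line)
}

/// Replace all match lines of a "hot" file with one summary line,
/// keeping the original order of everything else.
fn collapse_matches(output: &str) -> Vec<String> {
    // First pass: count matches per file.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for line in output.lines() {
        *counts.entry(file_of(line)).or_insert(0) += 1;
    }
    // Second pass: emit lines, summarizing each hot file once.
    let mut summarized: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    for line in output.lines() {
        let file = file_of(line);
        let n = counts.get(file).copied().unwrap_or(0);
        if n >= COLLAPSE_AT {
            if summarized.insert(file) {
                // Summary wording is illustrative only.
                out.push(format!("{file}: {n} matches (collapsed)"));
            }
        } else {
            out.push(line.to_string());
        }
    }
    out
}
```

The summary line is emitted at the position of the file's first match, so files with few matches keep their place relative to the collapsed ones.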
fix: resolve gap analysis — new handlers, wider redundancy window, token budget
Superseded: promoting develop directly with all gap fixes included.
Summary
Lands the full squeez token-optimizer roadmap (PR1 + PR2) on main:
- squeez compress-md: pure-Rust markdown compressor; Ultra mode with abbreviations; damage heuristic; auto-runs on every session start
- squeez update: self-updater via curl + SHA-256 verification; atomic install
- track-result hook: PostToolUse Read/Grep/Glob results feed SessionContext

Benchmarks (macOS Apple Silicon)

[Benchmark table not recovered; surviving labels: ps aux, cargo build, (Ultra), git log, 200 commits, docker logs]

14/14 fixtures pass. All latency ≤15ms.
Test plan
- cargo test: 18 integration tests pass (+ comprehensive unit suite)
- bash bench/run.sh: 14/14 fixtures
- bash bench/run_context.sh: 3/3 context scenarios

Closes #10
Closes #12