feat(local-executor): add opt-in sandbox posture to LocalCommandLineCodeExecutor #7598
Draft
xr843 wants to merge 1 commit into microsoft:main
…odeExecutor

Refs microsoft#7462. Supersedes closed PR microsoft#7467.

The legacy `UserWarning` at construction was easily suppressed by production warning filters and `python -W ignore`, leaving unsandboxed execution of LLM-generated code as the silent default. This change introduces an explicit three-state sandbox posture parameter:

- sandbox=None (default, legacy behavior for backward compatibility): DeprecationWarning + logger.warning() surface the risk in both Python warning channels and structured logging pipelines. A future release will make this parameter required.
- sandbox=False: Caller explicitly acknowledges unsandboxed execution; no warning is emitted.
- sandbox=True: Best-effort in-process hardening:
  * Environment entries whose name contains credential patterns (TOKEN, SECRET, API_KEY, PASSWORD, PRIVATE_KEY, CREDENTIAL, SESSION, COOKIE, AUTH) are stripped from the child process.
  * On POSIX, per-child rlimits (RLIMIT_CPU, RLIMIT_AS, RLIMIT_NOFILE, RLIMIT_NPROC) are applied via preexec_fn so runaway memory/fork-bomb payloads are capped.
  * On Windows, env scrub applies but preexec is unavailable; a UserWarning directs callers to the Docker executor for strong isolation.

Docstring and LocalCommandLineCodeExecutorConfig are updated to round-trip the posture through serialization so declarative deployments cannot silently downgrade.

This is NOT a substitute for DockerCommandLineCodeExecutor: adversarial payloads can still read files, make outbound connections, and write to work_dir. The docstring states this explicitly.

Tests cover: default DeprecationWarning, explicit opt-out silence, env scrubbing on POSIX, and config round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
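For reviewers who want a concrete picture of the two `sandbox=True` hardening steps, here is a minimal sketch. It is not the code in this PR: `scrub_env`, `make_preexec`, and `run_hardened` are hypothetical names, the numeric limits are illustrative placeholders, and case-insensitive substring matching is an assumption of the sketch.

```python
import os
import subprocess

# Substrings taken from the commit message. Case-insensitive matching is an
# assumption of this sketch, not a statement about the PR's implementation.
CREDENTIAL_PATTERNS = (
    "TOKEN", "SECRET", "API_KEY", "PASSWORD", "PRIVATE_KEY",
    "CREDENTIAL", "SESSION", "COOKIE", "AUTH",
)


def scrub_env(env):
    """Return a copy of env without entries whose name contains a credential pattern."""
    return {
        name: value
        for name, value in env.items()
        if not any(pattern in name.upper() for pattern in CREDENTIAL_PATTERNS)
    }


def make_preexec(cpu_seconds=30, address_space=2 * 1024**3, open_files=256, processes=64):
    """Build a preexec_fn that caps the child's resources (POSIX only).

    The default values are placeholders chosen for illustration.
    """
    import resource  # POSIX-only module, so it is imported lazily

    def _apply_limits():
        for limit, value in (
            (resource.RLIMIT_CPU, cpu_seconds),
            (resource.RLIMIT_AS, address_space),
            (resource.RLIMIT_NOFILE, open_files),
            (resource.RLIMIT_NPROC, processes),
        ):
            _soft, hard = resource.getrlimit(limit)
            # An unprivileged process cannot raise its hard limit, so clamp to it.
            if hard != resource.RLIM_INFINITY:
                value = min(value, hard)
            resource.setrlimit(limit, (value, value))

    return _apply_limits


def run_hardened(cmd, env=None):
    """Run cmd with a scrubbed environment and, on POSIX, per-child rlimits."""
    child_env = scrub_env(dict(os.environ if env is None else env))
    preexec = make_preexec() if os.name == "posix" else None
    return subprocess.run(
        cmd, env=child_env, preexec_fn=preexec, capture_output=True, text=True
    )
```

Substring matching is deliberately broad: in this sketch a variable such as `XDG_SESSION_TYPE` or `GIT_AUTHOR_NAME` is dropped along with real credentials. Note also that the Python documentation warns that `preexec_fn` is not safe to use in the presence of threads, which may matter for callers that run the executor from a threaded application.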
Summary
Refs #7462. Supersedes closed #7467.
Adds an explicit three-state `sandbox` parameter to `LocalCommandLineCodeExecutor` so callers cannot accidentally rely on an easily-suppressed `UserWarning` as their only signal that LLM-generated code is running unsandboxed.

| `sandbox` | Behavior |
| --- | --- |
| `None` (default, legacy) | Emits `DeprecationWarning` and `logger.warning()`. Execution unchanged; fully backward compatible. In a future release this parameter becomes required. |
| `False` | Caller explicitly acknowledges unsandboxed execution; no warning is emitted. |
| `True` | Best-effort in-process hardening, described in the next section. |

sandbox=True hardening
- Environment entries whose names contain credential patterns (`TOKEN`, `SECRET`, `API_KEY`, `PASSWORD`, `PRIVATE_KEY`, `CREDENTIAL`, `SESSION`, `COOKIE`, `AUTH`) are removed from the child process. This is the most common path by which LLM-generated code exfiltrates provider keys.
- On POSIX, per-child rlimits are applied via `preexec_fn`: `RLIMIT_CPU`, `RLIMIT_AS`, `RLIMIT_NOFILE`, `RLIMIT_NPROC` cap runaway memory and fork-bomb payloads.
- On Windows, `preexec_fn` is unavailable → env scrub applies but a `UserWarning` directs callers to `DockerCommandLineCodeExecutor` for strong isolation.

This is NOT a substitute for `DockerCommandLineCodeExecutor`. The docstring says so verbatim. Adversarial payloads can still read files, call out to the network, and write within `work_dir`. The opt-in flag exists to:

Backward compatibility
- The new parameter is optional (`sandbox: Optional[bool] = None`).
- `LocalCommandLineCodeExecutorConfig` gains the same field so serialization round-trips the posture; a declarative deployment cannot silently downgrade to the default warning path.
- The default path emits a `DeprecationWarning` with actionable guidance.

Threading to the issue's open questions
- `sandbox` is the gate.
- `logger.warning()` fires on every instantiation in the default path. A dedicated telemetry hook could follow as a separate PR if maintainers want it.
- #3 (README delta): intentionally not in this PR to keep the diff focused on code + tests. Happy to add in a follow-up once this direction is acked.

Test plan
Four new tests in `test_commandline_code_executor.py`:
- `test_sandbox_default_emits_deprecation_warning`: default path emits `DeprecationWarning`
- `test_sandbox_false_is_silent_opt_out`: explicit acknowledgement emits nothing
- `test_sandbox_true_strips_credential_env` (POSIX-only): subprocess sees `<missing>` for `MY_API_KEY` / `SOME_TOKEN`; `HARMLESS_VAR` survives
- `test_sandbox_roundtrips_through_config`: `dump_component` / `load_component` preserve the posture and don't re-trigger the default warning

Local run:
`pytest tests/code_executors/test_commandline_code_executor.py` → 17 passed, 1 skipped (pre-existing venv skip). `ruff check`, `ruff format --check`, and `mypy` all clean on the touched files.

Follow-ups (not in scope here)
- README delta (#3).
- `RLIMIT_FSIZE` ceiling on `work_dir` write size.

Happy to split this into two PRs (warning upgrade vs. `sandbox=True` hardening) if maintainers prefer smaller reviews. Drafting so the direction can be sanity-checked before polish on the Windows path.
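To make the three-state posture from the Summary table easier to sanity-check, here is a minimal, self-contained sketch of the warning behaviour. `resolve_sandbox_posture`, the logger name, and the message text are hypothetical and only illustrate the None / False / True split; the actual constructor logic lives in the diff.

```python
import logging
import warnings
from typing import Optional

logger = logging.getLogger("sandbox_posture_sketch")  # hypothetical logger name


def resolve_sandbox_posture(sandbox: Optional[bool] = None) -> bool:
    """Return True when hardening should be applied.

    None  -> legacy default: warn on both channels, run unsandboxed.
    False -> explicit opt-out: no warning.
    True  -> opt-in hardening.
    """
    if sandbox is None:
        message = (
            "No sandbox posture was chosen; generated code will run unsandboxed. "
            "Pass sandbox=True or sandbox=False explicitly."
        )
        # stacklevel=2 attributes the warning to the caller's line.
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        logger.warning(message)
        return False
    return sandbox


# Usage: only the default (None) path produces a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_posture = resolve_sandbox_posture()
    opt_out_posture = resolve_sandbox_posture(False)
    opt_in_posture = resolve_sandbox_posture(True)

default_warning_count = len(caught)
```

Python's default filters hide `DeprecationWarning` unless it is triggered by code in `__main__`, which is presumably why the default path also calls `logger.warning()`: the log record still reaches structured logging pipelines when the warning itself is filtered out.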