SY-4159: Arc String Formatting#2310

Open

sy-nico wants to merge 22 commits into rc from sy-4159-string-formatting-v2

Conversation

@sy-nico
Contributor

@sy-nico sy-nico commented May 9, 2026

Issue Pull Request

Linear Issue

SY-4159

Description

Adds {...} placeholder interpolation to Arc backtick raw string literals, with optional %spec format codes for numeric values and \{ / \} escapes for literal braces. Placeholders work inline:

x := `hello, {name}`
y := `pi is {3.14159 % .2f}`         // "pi is 3.14"
z := `count: {(sensor + offset)%05d}`
w := `literal braces: \{ and \}`     // "literal braces: { and }"

and in flow form, where the placeholder body is registered as a synthetic function activated by an upstream trigger:

time.interval{1s} -> `value is {sensor}` -> log

The feature spans five layers:

  • Parser: no grammar change. The existing STR_LITERAL_RAW token now allows {...} and \{ / \} inside its body.
  • Analyzer: a new fmtstring package owns placeholder parsing, spec splitting, and numeric spec validation. AnalyzeStringFmtLiteral walks each placeholder, type checks the inner expression, and anchors per-placeholder diagnostics on the specific {...} span via a new Position.Advance / Diagnostic.WithRange helper pair on x/go/diagnostics.
  • Compiler: inline literals lower to a concat chain over EmitNumericToString / EmitNumericFormat host calls (arc/go/compiler/expression/string_fmt.go). Flow form lowers to a synthetic zero input WASM function gated by the fmt$ IR prefix (arc/go/compiler/string_fmt.go).
  • Runtime: new format_i32/u32/i64/u64/f32/f64 host functions in arc/go/stl/strings read the spec bytes from WASM memory and call fmt.Sprintf. The existing from_* helpers were collapsed onto generic registerFrom / registerFormat factories.
  • Console LSP: expandRawStringPlaceholders delegates to fmtstring.Parse and emits semantic tokens for placeholder braces, the inner expression, and the %spec tail. A new stringPlaceholder color is registered. A small Monaco trigger in Editor.tsx works around monaco-vscode-api ignoring the language tokenTypes config so completions still fire inside {...}.

The fmtstring package is the single source of truth for placeholder position information. The previous hand-rolled scanner in the LSP has been deleted, which removes the risk of the two implementations drifting apart whenever escape or spec syntax evolves.
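
To make the literal/placeholder split concrete, here is a minimal sketch of a scanner for a raw-string body. It is not the fmtstring package's code: the `Segment` type and `splitSegments` function are hypothetical names, nested braces are not handled, and spec splitting is left out. It only illustrates the `\{` / `\}` escapes and the byte offsets that per-placeholder diagnostics would need.

```go
package main

import (
	"fmt"
	"strings"
)

// Segment is a hypothetical type for this sketch, not the fmtstring API.
type Segment struct {
	Text        string // literal text, or the raw placeholder body
	Placeholder bool   // true when the segment came from {...}
	Start       int    // byte offset in the body ('{' for placeholders)
}

// splitSegments scans a raw-string body into literal and placeholder
// segments. `\{` and `\}` become literal braces; an unescaped `}` outside
// a placeholder is kept as literal text. Nested braces are not supported.
func splitSegments(body string) ([]Segment, error) {
	var (
		segs     []Segment
		lit      strings.Builder
		litStart int
	)
	flush := func() {
		if lit.Len() > 0 {
			segs = append(segs, Segment{Text: lit.String(), Start: litStart})
			lit.Reset()
		}
	}
	for i := 0; i < len(body); i++ {
		c := body[i]
		switch {
		case c == '\\' && i+1 < len(body) && (body[i+1] == '{' || body[i+1] == '}'):
			if lit.Len() == 0 {
				litStart = i
			}
			lit.WriteByte(body[i+1]) // keep the brace, drop the backslash
			i++
		case c == '{':
			end := strings.IndexByte(body[i+1:], '}')
			if end < 0 {
				return nil, fmt.Errorf("unterminated placeholder at byte %d", i)
			}
			flush()
			segs = append(segs, Segment{Text: body[i+1 : i+1+end], Placeholder: true, Start: i})
			i += end + 1 // now on the closing '}'; the loop increment steps past it
		default:
			if lit.Len() == 0 {
				litStart = i
			}
			lit.WriteByte(c)
		}
	}
	flush()
	return segs, nil
}

func main() {
	segs, err := splitSegments("hello, {name} and \\{literal\\}")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range segs {
		fmt.Printf("%+v\n", s)
	}
}
```

Keeping the offsets on the segments is what would let an analyzer and an LSP share one scanner instead of each tracking positions on their own.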

Basic Readiness

  • I have performed a self review of my code.
  • I have added relevant, automated tests to cover the changes.
  • I have updated documentation to reflect the changes. (Format spec table & distinction between backtick and quotation literals)

Greptile Summary

This PR adds {...} placeholder interpolation to Arc backtick raw string literals, spanning the parser, analyzer (fmtstring package), compiler (inline concat chain and synthetic WASM functions), runtime (new format_* host functions), and Console LSP (semantic tokens + Monaco completion trigger).

  • fmtstring package is the single source of truth for placeholder parsing; it uses : as the spec separator (e.g. {x:.2f}), handles \{ escape for literal braces, and validates specs via a regex shape check plus a %!-detection probe, with a correct blacklist for verbs like v, T, and U.
  • Compiler path correctly lowers inline literals to a concat chain over from_*/format_* host calls and lowers flow-form literals to synthetic zero-input fmt$-prefixed IR functions; the compileNumericLiteral guard that now also resets a non-numeric hint prevents string-context type leakage into numeric literal compilation.
  • Runtime refactors from_* registration to generic registerFrom/registerFormat helpers and adds format_string; formatWithSpec defensively returns "" on nil-memory or failed Read, and the format_string host function silently discards the s.Get error — both silent empty-string fallbacks are hard to diagnose if a compiler bug produces a bad pointer or handle.

Confidence Score: 5/5

Safe to merge; all findings are defensive-programming gaps that only surface on impossible-in-practice inputs (nil WASM memory, invalid string handle) or malformed format strings already rejected by the flow analyzer.

The feature spans five well-tested layers and the core paths — parsing, type checking, spec validation, WASM emission, and LSP tokenization — are all correct. The three flagged items are silent empty-string returns in runtime host functions (reachable only via a compiler bug) and a missing diagnostic guard in a code path the flow analyzer already covers earlier. No correct program is affected.

arc/go/stl/strings/string.go (silent failure in formatWithSpec and format_string) and arc/go/text/analyze.go (missing error propagation for fmtstring.Parse).

Important Files Changed

| Filename | Overview |
| ------- | ------- |
| arc/go/fmtstring/fmtstring.go | New single-source-of-truth package for format-string parsing; spec separator is :, not % as in the PR description. Logic is well-tested with comprehensive edge-case coverage. |
| arc/go/stl/strings/string.go | Refactors from_* registration to generic helpers and adds format_* host functions. Silent empty-string returns in formatWithSpec and ignored Get error in format_string are defensive but opaque failure paths. |
| arc/go/compiler/expression/string_fmt.go | New file lowering format-string segments to WASM; correctly separates literal, numeric-with-spec, string-with-spec, and no-spec paths. |
| arc/go/text/analyze.go | Adds synthetic IR function generation for raw strings with placeholders in flow context; silently swallows fmtstring.Parse errors without reporting diagnostics. |
| arc/go/analyzer/expression/string_fmt.go | New file: per-placeholder type checking and spec validation with anchored diagnostics on the {…} span via Position.Advance/WithRange. |
| arc/go/lsp/semantic.go | expandRawStringPlaceholders delegates offset tracking to fmtstring.Parse and correctly advances LSP positions using the posOf monotonic cursor pattern. |
| x/go/diagnostics/diagnostics.go | Adds Position.Advance (byte-walk column tracking) and Diagnostic.WithRange for per-placeholder span anchoring; byte-based advance is consistent with the rest of the diagnostics system. |
| console/src/arc/editor/placeholderSuggest.ts | New file: workaround for monaco-vscode-api ignoring tokenTypes config; uses lookbehind regex to avoid false-positive triggers on { escapes. |
| arc/go/ir/string_fmt.go | Minimal file adding StringFmtSyntheticPrefix constant for tagging synthetic IR functions. |
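
The "byte-walk column tracking" mentioned for Position.Advance can be pictured with the sketch below. The `Position` fields, the method signature, and the zero-based column convention are assumptions made for illustration; the real type in x/go/diagnostics may differ.

```go
package main

import "fmt"

// Position is a hypothetical stand-in for the diagnostics position type.
// Line and Col are assumed to be zero-based, with Col counted in bytes.
type Position struct {
	Line int
	Col  int
}

// Advance walks s byte by byte and returns the position just past it.
// A newline moves to the next line and resets the column; every other
// byte moves the column forward by one.
func (p Position) Advance(s string) Position {
	for i := 0; i < len(s); i++ {
		if s[i] == '\n' {
			p.Line++
			p.Col = 0
		} else {
			p.Col++
		}
	}
	return p
}

func main() {
	// Assume the raw-string body starts at line 3, column 10.
	start := Position{Line: 3, Col: 10}
	body := "value is {sensor}"

	// Anchor a diagnostic on the {sensor} span rather than on the whole literal.
	open := start.Advance(body[:9])   // text before the '{'
	end := open.Advance(body[9:])     // through the closing '}'
	fmt.Printf("placeholder spans %+v to %+v\n", open, end)
}
```

Because the walk is byte-based, a multi-byte UTF-8 character advances the column by more than one, which matches the table's note that byte-based advance is consistent with the rest of the diagnostics system.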

Sequence Diagram

sequenceDiagram
    participant User as Arc Source
    participant Parser as Parser (STR_LITERAL_RAW)
    participant FmtPkg as fmtstring.Parse
    participant Analyzer as Analyzer
    participant Compiler as Compiler
    participant WASM as WASM Runtime
    participant Host as Host (string.format_*)

    User->>Parser: "backtick value is {sensor:.2f} backtick"
    Parser->>FmtPkg: Parse(body)
    FmtPkg-->>Analyzer: "[]Segment{literal, placeholder{expr,spec}}"
    Analyzer->>Analyzer: type-check expr, ValidateSpec(spec, t)
    Analyzer-->>Compiler: IR (inline or synthetic fmt$ function)
    Compiler->>WASM: "EmitFmtSegments → from_*/format_* calls"
    WASM->>Host: format_f32(value, spec_ptr, spec_len)
    Host->>Host: memory.Read(spec_ptr, spec_len) → %.2f
    Host->>Host: fmt.Sprintf(%.2f, value)
    Host-->>WASM: string handle
    WASM->>WASM: string.concat(literal_handle, fmt_handle)
    WASM-->>User: value is 3.14
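
The host-side step in the diagram (read the spec bytes from WASM memory, then call fmt.Sprintf) can be sketched as follows. This is an illustration under stated assumptions, not the code in arc/go/stl/strings: the `memory` interface and `sliceMemory` type are hypothetical stand-ins for the runtime's linear memory, and this version returns an explicit error where the summary above describes a silent empty-string fallback.

```go
package main

import (
	"errors"
	"fmt"
)

// memory is a hypothetical, minimal view of WASM linear memory.
type memory interface {
	Read(offset, length uint32) ([]byte, bool)
}

// sliceMemory backs the interface with a plain byte slice for the example.
type sliceMemory []byte

func (m sliceMemory) Read(offset, length uint32) ([]byte, bool) {
	end := uint64(offset) + uint64(length)
	if end > uint64(len(m)) {
		return nil, false // out-of-range read
	}
	return m[offset:end], true
}

// formatWithSpec reads the spec (for example "%.2f") from memory and applies
// it to value. Unlike a silent "" fallback, failures are reported to the caller.
func formatWithSpec(mem memory, ptr, n uint32, value any) (string, error) {
	if mem == nil {
		return "", errors.New("format: nil memory")
	}
	spec, ok := mem.Read(ptr, n)
	if !ok {
		return "", fmt.Errorf("format: spec read out of range (ptr=%d, len=%d)", ptr, n)
	}
	return fmt.Sprintf(string(spec), value), nil
}

func main() {
	mem := sliceMemory("..%.2f") // spec bytes live at offset 2, length 4
	out, err := formatWithSpec(mem, 2, 4, 3.14159)
	fmt.Println(out, err)

	_, err = formatWithSpec(mem, 4, 64, 3.14159) // simulates a bad pointer
	fmt.Println(err)
}
```

Whether a host function should trap, log, or return an empty string on a bad pointer is a design decision for the runtime; the sketch only shows that the failure can be made visible.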

Comments Outside Diff (1)

  1. arc/go/fmtstring/fmtstring.go, lines 161-175

    P2 ValidateNumericSpec accepts non-numeric verbs like %T and %v

    ValidateNumericSpec validates by calling fmt.Sprintf and checking for %! in the output. However, verbs like %T (type name) and %v (generic) also pass this check for integer/float types — fmt.Sprintf("%T", int64(0)) returns "int64", not "%!T(int64=0)". A user writing {x%T} would get the runtime type string "int32" instead of the numeric value, which is almost certainly unintended. The validator should be restricted to explicitly allow only numeric verbs (e.g., d, f, e, g, x, o, b, s on strings) rather than relying on the absence of %!.
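
The gap described in this comment is easy to reproduce. The sketch below contrasts a `%!`-probe with an explicit verb allowlist. The function names and the allowlist contents are hypothetical and only illustrate the suggestion; they are not the validator that ships in fmtstring.

```go
package main

import (
	"fmt"
	"strings"
)

// probeAccepts mirrors the approach described above: format a zero value
// and treat the spec as valid unless fmt emits its "%!" error marker.
func probeAccepts(spec string, zero any) bool {
	return !strings.Contains(fmt.Sprintf(spec, zero), "%!")
}

// Hypothetical allowlists for this sketch.
const (
	allowedIntVerbs   = "dxXobc"
	allowedFloatVerbs = "feEgG"
)

// allowlistAccepts checks only the verb, which is the final byte of the spec.
func allowlistAccepts(spec, allowed string) bool {
	if spec == "" {
		return false
	}
	return strings.IndexByte(allowed, spec[len(spec)-1]) >= 0
}

func main() {
	for _, spec := range []string{"%05d", "%T", "%v", "%f"} {
		fmt.Printf("%-5s probe=%-5v allowlist=%v\n",
			spec,
			probeAccepts(spec, int64(0)),
			allowlistAccepts(spec, allowedIntVerbs))
	}
}
```

For an int64 value the probe accepts `%T` and `%v` because fmt formats them without error, while the allowlist rejects both. Combining the two checks would keep the probe's coverage of flags and widths and add the verb restriction.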

Reviews (7): Last reviewed commit: "SY-4159: Misc updates"

@codecov

codecov Bot commented May 9, 2026

Codecov Report

❌ Patch coverage is 89.41685% with 49 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.22%. Comparing base (1214e6c) to head (db1e77a).
⚠️ Report is 4 commits behind head on rc.

| Files with missing lines | Patch % | Lines |
| ------- | ------- | ------- |
| arc/go/compiler/expression/string_fmt.go | 73.91% | 8 Missing and 4 partials ⚠️ |
| console/src/arc/editor/placeholderSuggest.ts | 35.71% | 6 Missing and 3 partials ⚠️ |
| arc/go/analyzer/flow/expression.go | 63.15% | 5 Missing and 2 partials ⚠️ |
| arc/go/lsp/semantic.go | 91.89% | 3 Missing and 3 partials ⚠️ |
| arc/go/compiler/string_fmt.go | 73.33% | 2 Missing and 2 partials ⚠️ |
| arc/go/text/analyze.go | 90.00% | 2 Missing and 1 partial ⚠️ |
| arc/go/compiler/compiler.go | 66.66% | 1 Missing and 1 partial ⚠️ |
| arc/go/compiler/expression/literal.go | 84.61% | 1 Missing and 1 partial ⚠️ |
| console/src/code/Editor.tsx | 0.00% | 1 Missing and 1 partial ⚠️ |
| console/src/arc/editor/text/Editor.tsx | 0.00% | 1 Missing ⚠️ |

... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##               rc    #2310      +/-   ##
==========================================
+ Coverage   64.97%   65.22%   +0.25%     
==========================================
  Files        2603     2608       +5     
  Lines      113946   114416     +470     
  Branches     8399     8410      +11     
==========================================
+ Hits        74035    74629     +594     
+ Misses      33774    33655     -119     
+ Partials     6137     6132       -5     
| Flag | Coverage Δ |
| ------- | ------- |
| alamos-go | 55.25% <ø> (ø) |
| arc-go | 77.71% <91.70%> (+0.43%) ⬆️ |
| aspen | 67.77% <ø> (-0.12%) ⬇️ |
| cesium | 82.39% <ø> (ø) |
| client-py | 85.86% <ø> (ø) |
| client-ts | 90.41% <ø> (ø) |
| console | 23.09% <27.77%> (+1.32%) ⬆️ |
| core | 68.01% <ø> (ø) |
| freighter-go | 66.74% <ø> (+0.07%) ⬆️ |
| freighter-integration | 1.48% <ø> (ø) |
| freighter-py | 79.96% <ø> (ø) |
| freighter-ts | 73.87% <ø> (ø) |
| oracle | 62.63% <ø> (ø) |
| pluto | 58.64% <ø> (+0.03%) ⬆️ |
| x-go | 81.70% <100.00%> (+0.02%) ⬆️ |

Comment thread console/src/code/placeholderSuggest.ts Outdated
Comment thread arc/go/fmtstring/fmtstring.go
@emilbon99 emilbon99 requested review from pjdotson and removed request for emilbon99 May 9, 2026 18:34
@emilbon99
Contributor

@pjdotson is a better reviewer for this PR :)

- placeholderSuggest: PLACEHOLDER_RE skips escaped `\{` so autocomplete no longer fires on escape sequences
- fmtstring: rename ValidateNumericSpec → ValidateSpec; accept string, constant, and KindVariable types (recurses into Constraint)
- stl/strings: add format_string host fn + symbol entry so string placeholders with specs route through fmt.Sprintf
- compiler: EmitStringFormat + extend numericSuffix to cover constant kinds, so bare {123%T} and {3.14%.2f} compile
- analyzer: drop blanket rejection of specs on string-typed placeholders
Contributor

@pjdotson pjdotson left a comment

Mostly looks good. I haven't looked through the Go code yet; I wanted to get general functionality feedback to you in case you have to change a lot. A couple of issues:

LSP color formatting is a little weird. If I mistype something like this, .2 gets colored in red. Shouldn't this instead be a compiler error?

time.wait {
    0s
} -> `Price is ${float+float2%f.2}` -> arc_string_test

See https://linear.app/synnax/issue/SY-4171/fix-arc-node-update-bug, I don't know if this was a preexisting issue or an issue introduced here.

Comment thread console/src/code/Editor.tsx Outdated
Comment thread console/src/code/placeholderSuggest.ts Outdated
Comment thread docs/site/src/pages/reference/control/arc/how-to/data-processing.mdx Outdated

| Spec | Types | Use | Example |
| ------- | -------- | ----------------- | ----------------------- |
| `%d` | int | Decimal | `42` → `42` |
Contributor

What about unsigned decimal?

Contributor Author

%d works for both. The int label in the Types column covers signed (i8 to i64) and unsigned (u8 to u64). Go's fmt formats both as plain decimal under %d, so there's no separate %u verb.
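
A quick way to confirm this behavior of Go's fmt package is the sketch below. The `decimal` helper is a name made up for the example; it simply forwards to fmt.Sprintf with `%d`.

```go
package main

import "fmt"

// decimal formats any integer value with %d. Go's fmt has no separate
// unsigned verb; %d prints signed and unsigned integers as plain decimal.
func decimal(v any) string {
	return fmt.Sprintf("%d", v)
}

func main() {
	values := []any{
		int8(-5),
		int64(-42),
		uint8(255),
		uint64(18446744073709551615), // largest uint64 value
	}
	for _, v := range values {
		fmt.Printf("%T -> %s\n", v, decimal(v))
	}
}
```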

Comment thread docs/site/src/pages/reference/control/arc/reference/syntax.mdx Outdated
Comment thread docs/site/src/pages/reference/control/arc/reference/syntax.mdx Outdated
| `%X` | int, str | Hex, uppercase | `255` → `FF` |
| `%c` | int | Unicode character | `65` → `A` |
| `%U` | int | Unicode escape | `65` → `U+0041` |
| `%f` | float | Decimal | `3.14` → `3.140000` |
Contributor

Why can't I format integers with %f?

Contributor Author

Same reason as above: format specs pass straight through to Go's fmt.Sprintf, and Go's fmt only accepts %f for floats. Calling fmt.Sprintf("%f", int64(0)) returns "%!f(int64=0)", which our validator catches at compile time.
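
The compile-time check described here can be pictured as the sketch below. `checkSpec` is a hypothetical name, and the real validator reportedly also applies a regex shape check and a verb blacklist, both omitted here. The spec is passed with its leading `%` to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// checkSpec formats a zero value of the placeholder's type with the given
// spec and reports an error when fmt emits its "%!" bad-verb marker.
func checkSpec(spec string, zero any) error {
	out := fmt.Sprintf(spec, zero)
	if strings.Contains(out, "%!") {
		return fmt.Errorf("invalid format spec %q: fmt produced %q", spec, out)
	}
	return nil
}

func main() {
	fmt.Println(checkSpec("%f", int64(0)))     // integer with a float verb
	fmt.Println(checkSpec("%.2f", float64(0))) // float with a float verb
	fmt.Println(checkSpec("%05d", int64(0)))   // integer with an integer verb
}
```

Running the probe against a zero value of the analyzed type means the mismatch is reported while compiling the Arc program rather than showing up as `%!f(...)` text at runtime.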

Comment thread docs/site/src/pages/reference/control/arc/reference/syntax.mdx Outdated
Comment thread docs/site/src/pages/reference/control/arc/reference/syntax.mdx Outdated
Comment thread console/src/code/placeholderSuggest.spec.ts Outdated
@sy-nico
Contributor Author

sy-nico commented May 12, 2026

@pjdotson ,

  1. Implemented a regex check that the spec "shape" is valid.

  2. This is pre-existing (it happens with numerics too). It was intentional in the original design, though I agree it's something we should change. I left a comment in the ticket explaining more.

- Change verb delimiter from `%` to `:`
- Treat an unescaped `}` as a literal `}` and simplify bracket placeholder detection logic
- Add verbs to blacklist
- Expand tests
- Update documentation
- Fix bug where autosuggest inside literal placeholders showed "No suggestions"