SY-3880: Arc String Conversion Support#2298

Merged
sy-nico merged 7 commits into rc from sy-3880-arc-string-conversion on May 8, 2026

Conversation

@sy-nico sy-nico commented May 6, 2026

Issue Pull Request

Linear Issue

SY-3880

Description

Adds support for converting numeric primitives (i8-i64, u8-u64, f32, f64) to strings via Arc's existing typecast syntax (e.g., str(42), str(3.14)). Floats use shortest round-trippable formatting matching Go's strconv.FormatFloat(_, 'g', -1, bitSize); the reverse direction (str to numeric) remains rejected.
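To illustrate the float formatting contract described above, here is a minimal Go sketch. The helper names `formatF32` and `formatF64` are hypothetical and are not taken from the PR; only `strconv.FormatFloat` with format `'g'`, precision `-1`, and the matching `bitSize` comes from the description.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatF32 (hypothetical helper) formats a 32-bit float as the shortest
// decimal string that parses back to the same float32 (bitSize 32).
func formatF32(v float32) string {
	return strconv.FormatFloat(float64(v), 'g', -1, 32)
}

// formatF64 (hypothetical helper) does the same for 64-bit floats.
func formatF64(v float64) string {
	return strconv.FormatFloat(v, 'g', -1, 64)
}

func main() {
	fmt.Println(formatF64(3.14)) // 3.14
	fmt.Println(formatF32(0.1))  // 0.1

	// Formatting a widened f32 with bitSize 64 exposes the extra digits
	// that the float32 value actually carries.
	fmt.Println(formatF64(float64(float32(0.1)))) // 0.10000000149011612
}
```

The last line shows why the `bitSize` has to match the source type: the same f32 value prints as `0.1` with `bitSize` 32 but picks up "ghost" digits when formatted as a 64-bit float.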

str() also works as a flow stage (channel -> str -> log). Both Go and C++ WASM runtimes now materialize string-typed outputs into a fresh series at the end of each tick rather than attempting to resize variable-density buffers in place, which previously caused a runtime trap.
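The accumulate-then-materialize approach can be sketched roughly as follows. This is not the runtime's code: the `series` and `tick` types are hypothetical stand-ins, and the newline-terminated sample layout is an assumption made only for the example.

```go
package main

import "fmt"

// series is a hypothetical stand-in for a variable-density string series.
// Assumption: samples are stored back to back, each terminated by '\n'.
type series struct {
	Data []byte
}

// Len recomputes the sample count from Data rather than caching it.
func (s series) Len() int {
	n := 0
	for _, b := range s.Data {
		if b == '\n' {
			n++
		}
	}
	return n
}

// tick collects the string results produced during one tick.
type tick struct {
	results []string
}

// emit records one string sample; nothing is written to the output yet.
func (t *tick) emit(v string) { t.results = append(t.results, v) }

// flush builds a fresh buffer from the accumulated samples and swaps it
// into out, so the previous buffer is never resized in place.
func (t *tick) flush(out *series) {
	size := 0
	for _, r := range t.results {
		size += len(r) + 1
	}
	data := make([]byte, 0, size)
	for _, r := range t.results {
		data = append(data, r...)
		data = append(data, '\n')
	}
	out.Data = data
	t.results = t.results[:0]
}

func main() {
	out := series{Data: []byte("stale\n")}
	var t tick
	t.emit("42")
	t.emit("3.14")
	t.flush(&out)
	fmt.Printf("%q %d\n", out.Data, out.Len()) // "42\n3.14\n" 2
}
```

Because each sample has a different byte length, the total size is only known once the tick has produced all of its samples, which is why building the buffer at the end is simpler than growing it in place.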

Conversion routing lives in a single Resolver.EmitNumericToString dispatcher so the planned f-string feature can reuse it without duplicating the source-type to host-fn switch.
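A dispatcher of this shape might look like the sketch below. The `kind` enum and the `hostFnFor` function are hypothetical names for illustration; the grouping of source types onto `string.from_*` host functions follows the routing shown in the review flowchart, and the real `EmitNumericToString` also emits the WASM call, which is omitted here.

```go
package main

import (
	"errors"
	"fmt"
)

// kind is a hypothetical enum of Arc primitive types.
type kind int

const (
	i8 kind = iota
	i16
	i32
	i64
	u8
	u16
	u32
	u64
	f32
	f64
	str
)

var errNotNumeric = errors.New("source type cannot be converted to str")

// hostFnFor picks the host function name for a source kind. Narrow
// integers share the 32-bit entry points.
func hostFnFor(k kind) (string, error) {
	var suffix string
	switch k {
	case i8, i16, i32:
		suffix = "i32"
	case u8, u16, u32:
		suffix = "u32"
	case i64:
		suffix = "i64"
	case u64:
		suffix = "u64"
	case f32:
		suffix = "f32"
	case f64:
		suffix = "f64"
	default:
		return "", errNotNumeric
	}
	return "string.from_" + suffix, nil
}

func main() {
	name, _ := hostFnFor(u16)
	fmt.Println(name) // string.from_u32

	_, err := hostFnFor(str)
	fmt.Println(err != nil) // true
}
```

Keeping the switch in one function means a later caller, such as the planned f-string feature, only needs the source kind to get the right host function.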

Basic Readiness

  • I have performed a self-review of my code.
  • I have added relevant, automated tests to cover the changes.
  • I have updated documentation to reflect the changes.

Greptile Summary

This PR adds str() typecast support that converts numeric primitives (i8–i64, u8–u64, f32, f64) to their decimal string representation, and fixes a runtime trap that occurred when attempting to resize variable-density string buffers in place during WASM tick evaluation.

  • Conversion routing is centralised in Resolver.EmitNumericToString, dispatching to one of six new string.from_* host functions (backed by strconv.FormatFloat with the correct bitSize on the Go side and std::to_chars(…, std::chars_format::general) on the C++ side). Float formatting matches Go's strconv.FormatFloat(_, 'g', -1, bitSize) contract.
  • Both runtimes (Go WASM and C++ WASM) now accumulate string outputs into a temporary slice per tick and materialise a fresh series at the end, instead of calling Resize on a variable-density buffer (which would panic or trap). The str() cast also works as an inline flow stage (channel -> str -> log).
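As a rough illustration of how a host function can hand a string back to WASM as a 32-bit handle, consider the sketch below. The `handleTable` type and its methods are hypothetical, not the PR's API; only the use of `strconv.FormatInt` and `strconv.FormatUint` and the `uint32` handle are taken from the review flowchart.

```go
package main

import (
	"fmt"
	"strconv"
)

// handleTable is a hypothetical per-runtime string store. Host functions
// return an index into it, which the WASM side carries as an i32.
type handleTable struct {
	strings []string
}

// put stores s and returns its handle.
func (h *handleTable) put(s string) uint32 {
	h.strings = append(h.strings, s)
	return uint32(len(h.strings) - 1)
}

// get resolves a handle, reporting false for an unregistered one.
func (h *handleTable) get(id uint32) (string, bool) {
	if int(id) >= len(h.strings) {
		return "", false
	}
	return h.strings[id], true
}

// fromI32 interprets the raw 32-bit value as signed before formatting.
func (h *handleTable) fromI32(raw uint32) uint32 {
	return h.put(strconv.FormatInt(int64(int32(raw)), 10))
}

// fromU32 formats the same bits as an unsigned value.
func (h *handleTable) fromU32(raw uint32) uint32 {
	return h.put(strconv.FormatUint(uint64(raw), 10))
}

func main() {
	var h handleTable
	a := h.fromI32(0xFFFFFFFF)
	b := h.fromU32(0xFFFFFFFF)
	sa, _ := h.get(a)
	sb, _ := h.get(b)
	fmt.Println(sa, sb) // -1 4294967295
}
```

The same 32 bits print differently depending on signedness, which is one reason signed and unsigned sources need separate `from_i32` and `from_u32` entry points even though both arrive as a WASM i32.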

Confidence Score: 5/5

The change is safe to merge. The numeric-to-string conversion path is well-isolated, the variable-density buffer fix correctly avoids the Resize trap in both runtimes, and test coverage spans unit, integration, and end-to-end layers across all numeric source types.

The core logic — accumulating string handles into a temporary slice per tick and materialising a fresh series at the end — is symmetric between Go and C++ runtimes and verified by multi-sample tests. The Go Data-only assignment is correct because telem.Series.Len() for variable-density types always recomputes from Data (Resize panics on variable-density, so cachedLength is never set persistently for string series). Float formatting uses the right bitSize (32 vs 64) to match Go's strconv contract. No logic errors or unsafe paths were found.

No files require special attention.

Important Files Changed

| Filename | Overview |
|---|---|
| arc/go/compiler/resolve/emit.go | New EmitNumericToString and EmitFixedCall functions; type-switch routing from all 10 numeric kinds to the correct from_* host fn looks correct. |
| arc/go/stl/strings/string.go | Six new host functions added; FormatFloat bitSize is correctly 32 for f32 and 64 for f64. SymbolResolver entries match host function signatures. |
| arc/go/stl/wasm/node.go | String outputs now accumulate into a []string slice; only out.Data is replaced at tick end. Verified safe: telem.Series.Len() for variable-density types recomputes directly from Data (cachedLength is never set persistently via Resize since Resize panics on variable-density). DPanicf on unregistered handle miss is a good defensive addition. |
| arc/cpp/stl/str/str.h | format_float correctly handles NaN/±Inf and now checks the to_chars error code (returning empty string on failure). Six new from_* linker bindings added. |
| arc/cpp/runtime/wasm/node.h | String outputs accumulate into a lazy-allocated vector-of-vectors and are materialised as a full Series assignment at tick end, correctly bypassing the Resize-on-variable-density trap. |
| arc/go/compiler/expression/cast.go | Hint suppression for str() targets prevents numeric literals from inheriting the string type; extractType extended to recognise STR primitive; EmitCast short-circuits to EmitNumericToString for string targets. |
| arc/go/str_cast_test.go | Comprehensive end-to-end tests covering literals, channel reads, flow-stage usage, and concat patterns across all numeric types. Ghost-precision f32 cases included. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["str(x) in Arc source"] --> B{Analyzer isValidCast}
    B -->|"numeric source"| C["Compiler: compileTypeCast\nhint suppressed for str target"]
    B -->|"str source"| D["Rejected: cannot cast"]
    C --> E["EmitCast → EmitNumericToString"]
    E --> F{source kind}
    F -->|"i8/i16/i32"| G["string.from_i32"]
    F -->|"u8/u16/u32"| H["string.from_u32"]
    F -->|"i64"| I["string.from_i64"]
    F -->|"u64"| J["string.from_u64"]
    F -->|"f32"| K["string.from_f32"]
    F -->|"f64"| L["string.from_f64"]
    G & H & I & J --> M["Host fn: FormatInt/Uint → string handle uint32"]
    K & L --> N["Host fn: FormatFloat 'g' -1 bitSize → string handle uint32"]
    M & N --> O["WASM stack: i32 handle"]
    O --> P{Runtime: string output?}
    P -->|yes| Q["Accumulate into stringResults per tick"]
    P -->|no| R["setValueAt numeric output"]
    Q --> S["End of tick: materialise fresh Series from accumulated strings"]
    R --> T["Resize to sample count"]

Reviews (4): Last reviewed commit: "SY-3380: Implement feedback"

sy-nico added 3 commits May 6, 2026 16:33
Allows numeric primitives (i8-i64, u8-u64, f32, f64) to convert to str via the existing typecast syntax; floats use shortest round-trippable formatting. Reverse direction (str -> numeric) remains rejected. Routing lives in a single Resolver.EmitNumericToString dispatcher so the planned f-string feature can reuse it.
codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 87.36842% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.78%. Comparing base (7b33bce) to head (f739706).
⚠️ Report is 8 commits behind head on rc.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| arc/go/stl/wasm/node.go | 62.96% | 9 Missing and 1 partial ⚠️ |
| arc/go/compiler/resolve/emit.go | 92.85% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##               rc    #2298      +/-   ##
==========================================
+ Coverage   64.75%   64.78%   +0.03%     
==========================================
  Files        2592     2596       +4     
  Lines      112597   113543     +946     
  Branches     8346     8396      +50     
==========================================
+ Hits        72911    73562     +651     
- Misses      33565    33830     +265     
- Partials     6121     6151      +30     

| Flag | Coverage Δ |
|---|---|
| arc-go | 77.17% <87.36%> (+0.21%) ⬆️ |
| core | 67.49% <ø> (-0.55%) ⬇️ |


Comment thread arc/cpp/stl/str/str.h
Comment thread arc/go/stl/wasm/node.go Outdated

@emilbon99 emilbon99 left a comment

Looks pretty good overall. Comments are mostly nitpicks.

Comment thread arc/cpp/runtime/wasm/node.h Outdated
const auto off = this->offsets[j];
this->state.output(j)->resize(off);
if (this->string_outputs[j]) {
*this->state.output(
Contributor

The formatting here is weird. See if you can coerce the formatter to be nicer. Also you can get rid of the {} since there is a single expression.

Contributor Author

Made an attempt

Comment thread arc/go/compiler/resolve/emit.go Outdated
}

// EmitStringFromI32 emits a call to string.from_i32.
func (r *Resolver) EmitStringFromI32(w *wasm.Writer, wID int) {
Contributor

Do all of these sub-functions need to be public? Is there anywhere we call anything other than EmitNumericToString?

Contributor Author

Collapsed them all into EmitNumericToString since nothing else called them.

Comment thread arc/go/compiler/resolve/emit.go Outdated
}

// EmitStringFromU32 emits a call to string.from_u32.
func (r *Resolver) EmitStringFromU32(w *wasm.Writer, wID int) {
Contributor

Seems like a lot of duplicated code across these functions. The output type is always the same; the only things that vary are the method name and input type. You could pass the input type as a parameter and then do fmt.Sprintf("string.from_%s", t). That would cut out the vast majority of the changes in this file.

Contributor Author

Folded into a single EmitNumericToString with a switch that picks the input type and suffix, then builds the name as "string.from_" + suffix, since that's marginally faster than fmt.Sprintf().

Outputs: types.Params{{Name: ir.DefaultOutputParam, Type: types.I64()}},
}),
},
"from_i32": {
Contributor

Kind of interesting that the type signatures are duplicated here and in the wasm writer. Nothing super wrong with it but I did notice it.

Contributor Author

Added EmitFixedCall that looks up the signature from the SymbolResolver by name, so string.go is the sole source of truth now.
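The idea of looking a signature up by name, so that it is declared in only one place, could be sketched like this. The `signature` type, the `registry` map, and the lowercase `emitFixedCall` are hypothetical and simplified; the actual EmitFixedCall and SymbolResolver types in the PR are not shown in this conversation.

```go
package main

import "fmt"

// signature is a hypothetical description of a host function's WASM types.
type signature struct {
	Params  []string
	Results []string
}

// registry plays the role of the symbol resolver in this sketch: the only
// place where host function signatures are declared. Every conversion
// returns an i32 string handle.
var registry = map[string]signature{
	"string.from_i32": {Params: []string{"i32"}, Results: []string{"i32"}},
	"string.from_u32": {Params: []string{"i32"}, Results: []string{"i32"}},
	"string.from_i64": {Params: []string{"i64"}, Results: []string{"i32"}},
	"string.from_u64": {Params: []string{"i64"}, Results: []string{"i32"}},
	"string.from_f32": {Params: []string{"f32"}, Results: []string{"i32"}},
	"string.from_f64": {Params: []string{"f64"}, Results: []string{"i32"}},
}

// emitFixedCall resolves the signature by name instead of accepting it as
// an argument, so call sites cannot drift from the registry.
func emitFixedCall(name string) (signature, error) {
	sig, ok := registry[name]
	if !ok {
		return signature{}, fmt.Errorf("unknown host function %q", name)
	}
	return sig, nil
}

func main() {
	sig, err := emitFixedCall("string.from_f64")
	fmt.Println(sig.Params, sig.Results, err) // [f64] [i32] <nil>

	_, err = emitFixedCall("string.from_bool")
	fmt.Println(err != nil) // true
}
```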

Comment thread arc/go/stl/wasm/node.go Outdated
Comment thread arc/go/stl/wasm/node.go Outdated
@sy-nico sy-nico merged commit 3c9f26a into rc May 8, 2026
37 checks passed
@sy-nico sy-nico deleted the sy-3880-arc-string-conversion branch May 8, 2026 03:46