Skip to content

SY-4108: Improve Arc Linting#2258

Open
emilbon99 wants to merge 8 commits intorcfrom
sy-4108-improve-arc-linting
Open

SY-4108: Improve Arc Linting#2258
emilbon99 wants to merge 8 commits intorcfrom
sy-4108-improve-arc-linting

Conversation

@emilbon99
Copy link
Copy Markdown
Contributor

@emilbon99 emilbon99 commented Apr 23, 2026

Issue Pull Request

Linear Issue

SY-4108

Description

Adds a new category of Arc compiler diagnostics (ARC51xx, ARC52xx) for dead code — declarations that are never referenced and control-flow paths that can never execute. Six rules ship in this tier:

  • ARC5101 — unused variable (locals, stateful variables, channel aliases).
  • ARC5102 — uncalled function, reachability-aware: a function is flagged when no call site lives in code that can actually execute.
  • ARC5103 — unused global constant.
  • ARC5201 — unreachable stage (a stage with no incoming => transition inside a reachable sequence).
  • ARC5202 — unstarted sequence (a sequence never activated by any reachable top-level =>).
  • ARC5203 — unreachable code after an always-returning statement inside a function body.

All six are warnings. Suppression is the universal leading-underscore convention (_helper, _x). Declarations whose type failed to resolve are skipped to avoid piling warnings onto unrelated errors.

The reachability rules (ARC5102, ARC5201, ARC5202) share a single forward BFS over a combined activation graph with three node kinds (sequence, stage, function) and a virtual root representing module-scope code. This collapses what started as two separate analyses into one pass.

Implementation lives in a new arc/go/analyzer/unused package, split by concern:

  • unused.go — entry point
  • declarations.go — ARC5101, ARC5103
  • unreachable_code.go — ARC5203
  • reachability.go — ARC5102, ARC5201, ARC5202

Use-site tracking is wired through Context.ReferencedSymbols (a pointer set on the analysis context, matching the existing CallEdges side-table pattern), so symbol.Scope does not grow analyzer-specific state. Existing error-assertion tests across analyzer sub-packages were updated to filter on Diagnostics.Errors() where they were conflating "error present" with "total diagnostic count."

Basic Readiness

  • I have performed a self-review of my code.
  • I have added relevant, automated tests to cover the changes.
  • I have updated documentation to reflect the changes.

Greptile Summary

This PR introduces a new unused analyzer package that adds six dead-code warnings (ARC5101/5102/5103/5201/5202/5203) to the Arc compiler. A single forward BFS over a combined activation graph (sequences, stages, functions) covers reachability, while a recursive scope walk handles unused declarations, and a per-block scan handles unreachable code after always-returning statements. Existing tests across analyzer sub-packages are updated to filter on Diagnostics.Errors() so the new warning diagnostics don't break error-count assertions.

Confidence Score: 5/5

Safe to merge; all findings are P2 edge-case concerns that do not affect the common path.

The implementation is well-structured with a clear separation of concerns across three passes, thorough positive/negative test coverage for all six rules, and proper cascade-suppression logic. All remaining findings are speculative edge cases or documented intentional limitations. No correctness bugs were found on the common path.

arc/go/analyzer/unused/reachability.go — the walkFlowOutputs GetChildren() loop and the => next on the last stage are worth a second look if the grammar evolves to support new flow constructs.

Important Files Changed

Filename Overview
arc/go/analyzer/unused/reachability.go New BFS-based reachability analysis for ARC5102/5201/5202; has a potential gap with the default: transition = false reset in walkFlowOutputs and a silent no-op for => next on the last stage.
arc/go/analyzer/unused/declarations.go ARC5101/5103 unused-declaration pass; recursive walkScope may cascade into unreachable stage scopes if those can contain variable declarations.
arc/go/analyzer/unused/unreachable_code.go ARC5203 unreachable-code pass; clean and correct — emits at most one warning per block to avoid cascades.
arc/go/analyzer/unused/unused.go Entry point that sequences the three passes; straightforward and correct.
arc/go/analyzer/context/context.go Adds ReferencedSymbols *set.Set[*symbol.Scope] field and MarkReferenced helper; correctly initialized in CreateRoot and propagated through Child.
arc/go/analyzer/codes/codes.go Adds six new ARC5xxx warning codes with clear doc comments; no issues.
arc/go/analyzer/expression/expression.go Single-line addition of ctx.MarkReferenced(resolved) in analyzePrimary to record identifier use-sites; correctly placed after successful resolution.
arc/go/analyzer/statement/statement.go Adds MarkReferenced for assignment LHS — intentionally marks write-only variables as referenced (documented limitation).
arc/go/analyzer/analyzer_test.go Updates existing tests to filter on Diagnostics.Errors() so new warning diagnostics don't break existing error-count assertions.
arc/go/analyzer/unused/unused_test.go Comprehensive test suite for all six new rules with positive, negative, cascade-suppression, and underscore-suppression cases.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    AP["AnalyzeProgram\n(main pass)"] --> UA["unused.Analyze"]
    UA --> AD["analyzeDeclarations\nARC5101, ARC5103"]
    UA --> AUN["analyzeUnreachable\nARC5203"]
    UA --> AR["analyzeReachability\nARC5102, ARC5201, ARC5202"]

    AD --> WS["walkScope\n(recursive)"]
    WS --> UD{"unusedDeclarationDiagnostic"}
    UD -->|KindGlobalConstant| C5103["ARC5103"]
    UD -->|KindVariable / KindStatefulVariable| C5101["ARC5101"]

    AUN --> CB["checkBlock\n(per function body)"]
    CB --> SAE["statementAlwaysExits?"]
    SAE -->|yes| C5203["ARC5203"]

    AR --> CRE["collectReachabilityEdges\n(CallEdges + AST walk)"]
    CRE --> BFS["reachableFromRoot\n(BFS from virtual root nil)"]
    BFS --> ERD["emitReachabilityDiagnostics"]
    ERD -->|KindFunction unreached| C5102["ARC5102"]
    ERD -->|KindSequence unreached| C5202["ARC5202"]
    ERD -->|KindStage unreached in reached seq| C5201["ARC5201"]
Loading

Reviews (1): Last reviewed commit: "tighten unused package: side-table + fil..." | Re-trigger Greptile

Greptile also left 4 inline comments on this PR.

Emits a warning for local variables, stateful variables, and channel
aliases that are declared but never referenced. Leading underscore
suppresses the warning (e.g., _x := 42). Declarations with an invalid
type (poisoned by an earlier analysis error) are skipped to avoid
piling on a secondary warning.

Updates existing tests in two idioms going forward:
- tests asserting error content filter on Diagnostics.Errors() so they
  are not coupled to unrelated warnings.
- test fixtures with incidental placeholder bindings use the _ prefix
  or drop the binding entirely.
Flags top-level functions declared but never invoked from any call site
(flow statement, stage body, routing table, or function-body expression).
Marks functions Referenced at flow.resolveFunc and expression-level call
sites; the unused pass walks KindFunction scopes with the same
leading-underscore suppression and poisoned-type skip as ARC5101.

Composition with sequence/stage reachability (for "called only from dead
code") is deferred until ARC5201/ARC5202 ship.

Test-helper cleanup: analyzeExpectSuccess and analyzeExpectError in
function_test.go now assert on Diagnostics.Errors() so they are not
coupled to warnings from orthogonal analysis passes. Incidental test
fixtures that declared placeholder functions now use the _ prefix.
Flags top-level constants declared but never referenced from any
expression. The existing analyzePrimary instrumentation already marks
constant scopes as Referenced at use-sites, so the walker only needs a
new KindGlobalConstant case. Leading-underscore suppression and the
poisoned-type skip carry over from ARC5101/5102.

constant_test.go helpers now filter on Diagnostics.Errors() so the
constant analyzer's tests are not coupled to warnings from orthogonal
passes.
Flags statements in function bodies that are preceded by an earlier
statement that always returns (a bare return, or an if/else where every
branch always returns). Reuses function.BlockAlwaysReturns and
IfStmtAlwaysReturns for the always-returns check. Recurses into nested
if / else-if / else blocks and stops at one warning per block to avoid
piling up on long dead tails.

Scope is function bodies only: stage bodies are reactive parallel
flows with no notion of statement ordering, so "after" does not apply.
Walks the program AST to build a directed activation graph where the
virtual root points at every sequence that appears as the target of a
module-scope `=>` flow. Entering a sequence is modeled as an implicit
edge to its entry stage (first in source order); stage bodies add
edges for intra-sequence `=>`, `=> next`, and cross-sequence activation.

A BFS from the virtual root yields the set of reachable sequences and
stages. A declared sequence outside that set is ARC5202 (unstarted).
A stage inside a reachable sequence that is not itself reached is
ARC5201 (unreachable stage); stage diagnostics are suppressed when the
containing sequence is already flagged as unstarted so users get one
diagnostic per declaration rather than a cascade. Leading-underscore
suppression carries over from the ARC51xx rules.

Composition with ARC5102: a function called only from an unreachable
stage or unstarted sequence is still considered referenced today.
Refining that will layer on top of this reachability pass.
A function is now flagged uncalled whenever no call site for it lives
inside code that can actually execute. Call sites are collected from
three contexts: top-level flow statements (always live), stage bodies
(live iff the stage is reachable via the sequence activation graph),
and function bodies (live iff the enclosing function is itself live).
Liveness is computed as a fixpoint over the call graph.

analyzeSequenceReachability now returns the reachable set for
downstream passes. The declaration walker no longer emits ARC5102;
emission moves to the new analyzeFunctionReachability pass which runs
after sequence reachability is known.

Added LSP test helper errorDiags for severity filtering, used in the
three fixture tests whose placeholder function now has no live caller
and would otherwise double-count R5102 warnings.
Replaces the separate sequence-reachability BFS and function-liveness
fixpoint with a single forward BFS over a combined activation graph.
Node kinds are sequence, stage, and function scopes, with nil as a
virtual root for module-scope code. Edges capture:

  nil      -> sequence / function   (top-level flow)
  sequence -> entry stage           (implicit)
  stage    -> stage / sequence      (transitions)
  stage    -> function              (flow-level invocation)
  function -> function              (ctx.CallEdges)

Any declared scope not visited by the BFS produces its respective
diagnostic (uncalled function, unreachable stage, unstarted sequence).
Removes the callerRef / callerKind enum, the callerIsLive helper, the
computeLiveFunctions fixpoint, and the two separate emit paths in
favor of a single emitter.

Net change: 216 production lines removed; tests unchanged.
Replace symbol.Scope.Referenced with context.Context.ReferencedSymbols
(a pointer to set.Set[*symbol.Scope]), matching the side-table pattern
already used for CallEdges. Mutation goes through a nil-safe
Context.MarkReferenced helper so sub-pass tests that hand-build a
Context without wiring shared state do not panic. Drops the two dead
Referenced writes on function scopes left over from the R5102
unification.

Split unused.go into three focused files:
  unused.go             - package doc + Analyze entry point
  declarations.go       - ARC5101, ARC5103 (declaration walker)
  unreachable_code.go   - ARC5203 (function body walker)

reachability.go (ARC5102, ARC5201, ARC5202) is unchanged. Shared
diagnostic construction consolidated into a small newUnusedDiagnostic
helper. symbol.Scope no longer carries analyzer-specific state, and
three analyzer sub-packages stop mutating a bool on it.
Comment on lines +157 to +170
transition := false
for _, child := range flowStmt.GetChildren() {
switch x := child.(type) {
case parser.IFlowOperatorContext:
transition = x.TRANSITION() != nil
case parser.IFlowNodeContext:
if transition {
addTarget(resolveTransitionTarget(stdCtx, resolveScope, nextStage, x))
}
transition = false
default:
transition = false
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 default: transition = false may silently drop => targets

The GetChildren() loop sets transition = false in the default branch, which covers every child that is neither a FlowOperatorContext nor a FlowNodeContext. If the ANTLR grammar injects any terminal token (e.g., a comma separator, a routing-table node) between a => operator and its target FlowNode, the flag is cleared and the transition edge is never added to the graph — silently producing a false "unreachable stage" warning.

The routing-table iteration above already handles function edges from within tables, but => transitions inside routing-table entries (if the grammar allows them) are not handled at all; the IRoutingTableContext child would hit default and reset the flag before the target FlowNodeContext is visited.

Comment on lines +37 to +48
func walkScope(
scope *symbol.Scope,
referenced set.Set[*symbol.Scope],
diag *diagnostics.Diagnostics,
) {
for _, child := range scope.Children {
if d := unusedDeclarationDiagnostic(child, referenced); d != nil {
diag.Add(*d)
}
walkScope(child, referenced, diag)
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 ARC5101 may cascade onto variables inside unreachable stages

walkScope recurses unconditionally into every scope in the tree, including scopes that belong to stages already flagged as unreachable (ARC5201) or inside an unstarted sequence (ARC5202). If Arc stage bodies support local-variable-style declarations (e.g., channel aliases x := ch), a variable in such a stage body would both miss all MarkReferenced calls (since the stage is never executed) and trigger an ARC5101 "unused variable" warning on top of the existing ARC5201/ARC5202 warning — the same cascade that the emitReachabilityDiagnostics logic explicitly avoids for stages inside unstarted sequences.

If stage bodies can contain KindVariable children this is worth suppressing: skip walkScope recursion into stage scopes that are not in the reached set, or at minimum add a test that verifies no ARC5101 fires inside an unreachable stage.

Comment on lines +107 to +110
var nextStage *symbol.Scope
if i+1 < len(stageScopes) {
nextStage = stageScopes[i+1]
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 nextStage is nil for the last stage; => next there silently goes nowhere

When processing the last stage in a sequence, nextStage is nil. resolveTransitionTarget returns nil for => next when nextStage == nil, and addEdge then silently drops the edge. This is correct for reachability purposes (there is no next stage), but it means an => next in the final stage of a sequence doesn't contribute any reachable edge. Consider whether => next in the last stage should be a separate semantic error surfaced earlier in the pipeline, rather than silently becoming a no-op in the reachability graph.

ctx.Diagnostics.Add(diagnostics.Error(err, ctx.AST))
return
}
ctx.MarkReferenced(varScope)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Write-only assignments mark the LHS as referenced

ctx.MarkReferenced(varScope) is called here for every assignment, including simple stores (x = 2) where the new value may never be read. This means a variable that is only ever written to will not trigger ARC5101. The PR description and tests document this as intentional ("deferred to unused-assignment rule"), but it creates a gap versus what users typically expect from an "unused variable" diagnostic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant