fix: capture Run() error in metrics controller goroutine #330

Merged
chrisliu1995 merged 1 commit into
openkruise:masterfrom
abhaygoudannavar:fix/stale-err-metrics-goroutine
Apr 23, 2026

@abhaygoudannavar
Contributor

What this PR does

Fixes a closure bug where the metrics controller goroutine logged the wrong error on failure.

Problem

In main.go (lines 280-285), the goroutine that runs the metrics controller checks the return value of metricsController.Run() but does not keep it. When it logs the failure, it references the outer err variable from the metrics.NewController() call instead:

metricsController, err := metrics.NewController(kruisegameInformerFactory)
if err != nil {
    setupLog.Error(err, "unable to create metrics controller")
    os.Exit(1)
}
kruisegameInformerFactory.Start(signal.Done())
go func() {
    if metricsController.Run(signal) != nil {
        setupLog.Error(err, "unable to setup metrics controller")  // wrong err
        os.Exit(1)
    }
}()

Since NewController succeeded (otherwise the process would have exited), err is nil by the time the goroutine runs. If Run() fails, the log therefore records a nil error, which makes the failure impossible to diagnose.
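The behaviour described above can be reproduced outside the project with a minimal sketch. Everything here is hypothetical and for illustration only: fakeController, newFakeController, and staleErr are stand-ins, not names from main.go, and the error is sent over a channel rather than logged so that it can be inspected.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeController is a hypothetical stand-in for the metrics controller.
type fakeController struct{}

// Run always fails, simulating a controller that cannot start.
func (fakeController) Run() error { return errors.New("cache sync failed") }

// newFakeController is a hypothetical constructor that always succeeds.
func newFakeController() (fakeController, error) { return fakeController{}, nil }

// staleErr mirrors the buggy pattern: the goroutine tests the result of Run
// but reports the constructor's err, which is nil at that point.
func staleErr() error {
	c, err := newFakeController()
	if err != nil {
		return err
	}
	reported := make(chan error, 1)
	go func() {
		if c.Run() != nil {
			reported <- err // outer err from the constructor, not Run's error
			return
		}
		reported <- nil
	}()
	return <-reported
}

func main() {
	// prints "reported error: <nil>" even though Run failed
	fmt.Printf("reported error: %v\n", staleErr())
}
```

The real failure ("cache sync failed" in this sketch) never reaches the caller, which is the same information loss the log line suffers from.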

Fix
Capture the return value of Run() into its own variable:

go func() {
    // Keep Run()'s own error so the log shows the actual failure.
    if runErr := metricsController.Run(signal); runErr != nil {
        setupLog.Error(runErr, "unable to setup metrics controller")
        os.Exit(1)
    }
}()
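
The same capture pattern can be checked in isolation. The sketch below is not the project's code: stubController and capturedErr are hypothetical names, and the error is returned through a channel instead of being logged so that the result is observable.

```go
package main

import (
	"errors"
	"fmt"
)

// stubController is a hypothetical stand-in for the metrics controller.
type stubController struct{}

// Run always fails, simulating a controller that cannot start.
func (stubController) Run() error { return errors.New("cache sync failed") }

// capturedErr binds Run's result to a variable scoped to the if statement,
// so the value that is tested is also the value that is reported.
func capturedErr() error {
	c := stubController{}
	reported := make(chan error, 1)
	go func() {
		if runErr := c.Run(); runErr != nil {
			reported <- runErr
			return
		}
		reported <- nil
	}()
	return <-reported
}

func main() {
	// prints "reported error: cache sync failed"
	fmt.Printf("reported error: %v\n", capturedErr())
}
```

Scoping runErr to the if statement also means the goroutine no longer reads any variable shared with the enclosing function.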

fixes #325

The goroutine was referencing the outer 'err' variable from
metrics.NewController() instead of capturing the return value
from metricsController.Run(). By the time the goroutine executes,
the outer err is nil (NewController succeeded), so the log line
always printed a nil error — making failures impossible to diagnose.
Fixes openkruise#325
@kruise-bot kruise-bot requested review from FillZpp and furykerry April 13, 2026 05:55
@kruise-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign furykerry for approval by writing /assign @furykerry in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov

codecov Bot commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 40.76%. Comparing base (206e756) to head (a19f1df).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #330   +/-   ##
=======================================
  Coverage   40.76%   40.76%           
=======================================
  Files         112      112           
  Lines       12544    12544           
=======================================
  Hits         5114     5114           
  Misses       7019     7019           
  Partials      411      411           
Flag Coverage Δ
unittests 40.76% <ø> (ø)

@abhaygoudannavar
Contributor Author

Hey @furykerry, could you take a look at this PR and let me know if any changes are required?

@chrisliu1995 chrisliu1995 merged commit 35eec22 into openkruise:master Apr 23, 2026
8 of 9 checks passed
@abhaygoudannavar abhaygoudannavar deleted the fix/stale-err-metrics-goroutine branch April 23, 2026 09:34

Linked issue: bug: metrics controller goroutine logs wrong error due to stale variable capture

3 participants