fix: stop custom resource reflectors on context cancellation#2883
fix: stop custom resource reflectors on context cancellation#2883colega wants to merge 2 commits intokubernetes:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: colega The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
This issue is currently awaiting triage. If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the The DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Welcome @colega! |
Custom resource reflectors only listened to their per-GVK stop channel, ignoring context cancellation from BuildWriters. When BuildWriters rebuilt stores (e.g. after a CRD update event), it cancelled the context to stop old reflectors, but custom resource reflectors kept running. Each rebuild accumulated leaked Reflector.Run and StreamWatcher.receive goroutine pairs, causing unbounded memory, CPU, and network growth. Merge both stop signals so CR reflectors stop on either context cancellation or per-GVK stop channel close.
a09ff70 to
6863cff
Compare
|
I pushed a fix for the data race uncovered by the e2e tests, however I think the issue is bigger than just the data race: |
Builder now creates its own cancellable context in NewBuilder instead of accepting one via WithContext. WithContext subscribes to the provided context's cancellation rather than replacing b.ctx, which avoids a data race with concurrent startReflector reads and ensures already-running reflectors get notified on cancellation.
d9d8357 to
6efdcb4
Compare
|
I pushed another fix to fix that: by the time when |
|
I have built an image with this and will run it over the weekend on our dev infrastructure to verify that the bug is fixed. |
|
I've tested this fix and it shows no leaks. |
|
Thanks for your contribution! I had a similar fix in #2870 but yours looks more complete. |
There was a problem hiding this comment.
Pull request overview
This PR fixes leaked goroutines for custom resource reflectors by ensuring they stop when either the per-GVK stop channel closes or the BuildWriters context is canceled (e.g., during store rebuilds after CRD updates).
Changes:
- Merge stop signals for custom-resource reflectors so they stop on either context cancellation or per-GVK stop.
- Initialize
Builderwith an internal cancelable context. - Change
WithContextbehavior to observe cancellation instead of replacing the builder context.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // WithContext will observe the cancellations of the provided context | ||
| // to cancel the internal b.ctx. | ||
| func (b *Builder) WithContext(ctx context.Context) { | ||
| b.ctx = ctx | ||
| // WithContext might be called concurrently with startReflector. | ||
| // In order to avoid the data race, and also to notify the reflectors that | ||
| // are already running with the context, we don't replace b.ctx, | ||
| // but just subscribe to the cancellations of the provided context. | ||
| go func() { | ||
| <-ctx.Done() | ||
| b.cancel() | ||
| }() | ||
| } |
There was a problem hiding this comment.
WithContext now cancels a single internal context (created in NewBuilder) when the provided ctx is canceled, but that internal context is never replaced/reset. Since MetricsHandler.BuildWriters cancels the previous ctx on every rebuild, the second BuildWriters call will cancel b.ctx permanently, causing all subsequently started reflectors (including non-CR ones using b.ctx.Done()) to stop immediately and the cache to stop updating. To preserve rebuild behavior, WithContext should continue to replace the builder context (or atomically swap to a new derived context/cancel func) rather than subscribing old contexts to cancel a long-lived internal context.
| stopCh := make(chan struct{}) | ||
| crStopCh := (*b.GVKToReflectorStopChanMap)[cr.GroupVersionKind().String()] | ||
| go func(builderContext context.Context) { | ||
| select { | ||
| case <-builderContext.Done(): | ||
| case <-crStopCh: | ||
| } | ||
| close(stopCh) | ||
| }(b.ctx) |
There was a problem hiding this comment.
The new stop-signal merge for custom resource reflectors is a concurrency/cancellation fix but isn't covered by tests in internal/store/builder_test.go. Adding a focused unit/integration test that exercises a CR reflector stopping on context cancellation (without closing the per-GVK stop channel) would help prevent regressions like goroutine leaks or reflectors not stopping on rebuild.
| stopCh := make(chan struct{}) | |
| crStopCh := (*b.GVKToReflectorStopChanMap)[cr.GroupVersionKind().String()] | |
| go func(builderContext context.Context) { | |
| select { | |
| case <-builderContext.Done(): | |
| case <-crStopCh: | |
| } | |
| close(stopCh) | |
| }(b.ctx) | |
| // For custom resources, prefer a per-GVK stop channel, but fall back to the | |
| // builder context if the map or entry is not available. | |
| if b.GVKToReflectorStopChanMap == nil { | |
| go reflector.Run(b.ctx.Done()) | |
| return | |
| } | |
| gvk := cr.GroupVersionKind().String() | |
| crStopCh, found := (*b.GVKToReflectorStopChanMap)[gvk] | |
| if !found || crStopCh == nil { | |
| go reflector.Run(b.ctx.Done()) | |
| return | |
| } | |
| stopCh := make(chan struct{}) | |
| go func(builderContext context.Context, crStopCh <-chan struct{}, stopCh chan struct{}) { | |
| defer close(stopCh) | |
| select { | |
| case <-builderContext.Done(): | |
| case <-crStopCh: | |
| } | |
| }(b.ctx, crStopCh, stopCh) |
|
There's no way I can keep just 6863cff: it adds a race condition. Builder's I'm not sure what's the expected behavior for I don't think I'll have time to dig into this in the short-term future, maybe you should consider reverting the piece of code that introduced this leak in the first place. |
|
No worries @colega - Thanks for your time on this so far. We are continuing to fix this via a different PR. |
Summary
Fixes #2867
Custom resource reflectors only listened to their per-GVK stop channel, ignoring context cancellation from
BuildWriters. WhenBuildWritersrebuilt stores after a CRD update event, it cancelled the context to stop old reflectors, but custom resource reflectors kept running since they were waiting on a different stop channel. Each rebuild accumulated leakedReflector.RunandStreamWatcher.receivegoroutine pairs, causing unbounded memory, CPU, and network growth.This merges both stop signals so CR reflectors stop on either context cancellation or per-GVK stop channel close.
I found one caveat however: the
Builder's context is replaced after we start the resource reflectors, so I changed howWithContextbehaves: instead of replacing the context, we subscribe to that context's cancellations. I think this can be achieved in some more cleaner way, for example removingWithContextand exporting aStop()method on builder's interface, but I'm not familiar with the codebase and I don't know whether this exported contract can be modified.