[WIP] feat: support dynamic namespace discovery#2864
[WIP] feat: support dynamic namespace discovery#2864Serializator wants to merge 19 commits intokubernetes:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Serializator The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
This issue is currently awaiting triage. If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the The DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
2972878 to
bac4618
Compare
|
@mrueg what do you think of these proposed changes? |
|
Hey @mrueg! Just a quick mention in case it was overlooked. If you rather have me bother someone else with this, let me know! |
|
@Serializator Thanks for the PR. Do you mind rebasing it, we can do a review then. |
|
@bhope no problem! I'll rebase this evening. It's 10:04 AM (UTC+2) for me right now. Just as an indicator 😉 |
There was a problem hiding this comment.
Pull request overview
Adds dynamic namespace discovery (via namespace label selectors) and refactors store construction to avoid per-namespace reflectors by filtering objects at the store/event-handler level.
Changes:
- Add
--namespace-label-selectoroption and document it. - Refactor store building to use a single cluster-wide reflector per resource, with namespace-based filtering in
MetricsStore. - Introduce namespace discovery machinery (and tests) to dynamically update the active namespace set.
Reviewed changes
Copilot reviewed 9 out of 12 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/options/options.go | Adds a new options field and CLI flag wiring for namespace label selectors. |
| docs/developer/cli-arguments.md | Documents the new --namespace-label-selector flag. |
| pkg/app/server.go | Adds configureNamespaceDiscovery and wires namespace discovery + rebuild loop into startup. |
| pkg/app/server_test.go | Adds tests for configureNamespaceDiscovery behavior. |
| pkg/builder/types/interfaces.go | Refactors builder interface to embed a NamespaceContainer capability. |
| internal/store/builder.go | Switches to a single NamespaceAll ListWatch per resource and applies namespace filtering via store opts. |
| pkg/metrics_store/metrics_store.go | Adds predicate support and a namespace predicate option to filter events at store level. |
| internal/discovery/ns_discovery.go | Adds a namespace informer-based discoverer and a polling API for namespace-set changes. |
| internal/discovery/ns_discovery_test.go | Adds tests for namespace discoverer behavior and concurrency. |
| internal/discovery/cr_types.go | Adds CR discoverer types used by CRD discovery. |
| internal/discovery/cr_discovery.go | Adds CRD discovery + cache polling logic (used by server startup). |
| internal/discovery/cr_discovery_test.go | Adds tests for CR discoverer GVK resolution logic. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…espace for performance
0fb5f3c to
2b2c266
Compare
|
@bhope I rebased. Unfortunately |
2b2c266 to
bec086c
Compare
Wrap ListWatch with ToListWatcherWithWatchListSemantics to allow the reflector to detect that fake clients don't support WatchList semantics. This fixes test hangs introduced with client-go v0.35+. Related: commit b871c40 fixed the same issue for the store builder.
bec086c to
709c089
Compare
bhope
left a comment
There was a problem hiding this comment.
Thanks for the PR draft and chalking out the core concept. The NamespaceDiscoverer and single reflector approach for dynamic discovery seem reasonable to me and the test coverage is good too.
However, the refactoring that is done for the buildstores (I get the reasoning is to simplify for cluster wide scope) is creating a security regression for multi-tenant clusters. It would require ClusterRoleBinding for all users, breaking the existing pattern where KSM can be scoped to specific namespaces via RoleBindings.
I would suggest to keep the original per-namespace reflectors intact while adding your cluster_wide reflectors and the new code to support namespace-label-selector set separately. This keeps the RBAC footprint unchanged for existing users while enabling the new dynamic discovery feature for those who opt in.
Also, once you have made the changes, try building locally to fix any compile errors (I believe there are some at the moment).
What this PR does / why we need it:
Add support for dynamic namespace discovery. This allows the user to filter namespaces by label, providing more dynamic and granular control over what namespaces to watch.
This is a work in progress but I'd really appreciate early feedback.
The most impactful change is actually the refactoring of the metrics store to filter objects by their namespace at the event handler level instead of starting a dedicated reflector per namespace.
I'd love feedback on this approach. Performance-wise this reduces the number of events being handled concurrently at any point in time as there is one goroutine per resource instead of per resource and namespace.
The counter-part is a goroutine per resource and namespace. With dynamic namespace discovery instead of a static list, there will be rate limiting at the API level due to the design of one reflector (list & watch) per namespace.
How does this change affect the cardinality of KSM:
No change.
Which issue(s) this PR fixes:
Fixes #2820