Skip to content

[WIP] feat: support dynamic namespace discovery#2864

Open
Serializator wants to merge 19 commits intokubernetes:mainfrom
Serializator:ISSUE-2820-dynamic-namespaces
Open

[WIP] feat: support dynamic namespace discovery#2864
Serializator wants to merge 19 commits intokubernetes:mainfrom
Serializator:ISSUE-2820-dynamic-namespaces

Conversation

@Serializator
Copy link
Copy Markdown

What this PR does / why we need it:
Add support for dynamic namespace discovery. This allows the user to filter namespaces by label, providing more dynamic and granular control over what namespaces to watch.

This is a work in progress but I'd really appreciate early feedback.

The most impactful change is actually the refactoring of the metrics store to filter objects by their namespace at the event handler level instead of starting a dedicated reflector per namespace.

I'd love feedback on this approach. Performance-wise this reduces the number of events being handled concurrently at any point in time as there is one goroutine per resource instead of per resource and namespace.

The counter-part is a goroutine per resource and namespace. With dynamic namespace discovery instead of a static list, there will be rate limiting at the API level due to the design of one reflector (list & watch) per namespace.

How does this change affect the cardinality of KSM:
No change.

Which issue(s) this PR fixes:
Fixes #2820

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 1, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Serializator
Once this PR has been reviewed and has the lgtm label, please assign mrueg for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 1, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

This issue is currently awaiting triage.

If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 1, 2026
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Instrumentation Feb 1, 2026
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 1, 2026
@Serializator Serializator changed the title [WIP] Issue 2820 dynamic namespaces [WIP] Support dynamic namespace discovery Feb 1, 2026
@Serializator Serializator force-pushed the ISSUE-2820-dynamic-namespaces branch from 2972878 to bac4618 Compare February 2, 2026 17:50
@Serializator Serializator changed the title [WIP] Support dynamic namespace discovery [WIP] feat: support dynamic namespace discovery Feb 2, 2026
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 10, 2026
@Serializator
Copy link
Copy Markdown
Author

@mrueg what do you think of these proposed changes?

@Serializator
Copy link
Copy Markdown
Author

Hey @mrueg! Just a quick mention in case it was overlooked. If you rather have me bother someone else with this, let me know!

@bhope
Copy link
Copy Markdown
Member

bhope commented Apr 15, 2026

@Serializator Thanks for the PR. Do you mind rebasing it, we can do a review then.

@Serializator
Copy link
Copy Markdown
Author

@bhope no problem! I'll rebase this evening. It's 10:04 AM (UTC+2) for me right now. Just as an indicator 😉

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds dynamic namespace discovery (via namespace label selectors) and refactors store construction to avoid per-namespace reflectors by filtering objects at the store/event-handler level.

Changes:

  • Add --namespace-label-selector option and document it.
  • Refactor store building to use a single cluster-wide reflector per resource, with namespace-based filtering in MetricsStore.
  • Introduce namespace discovery machinery (and tests) to dynamically update the active namespace set.

Reviewed changes

Copilot reviewed 9 out of 12 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
pkg/options/options.go Adds a new options field and CLI flag wiring for namespace label selectors.
docs/developer/cli-arguments.md Documents the new --namespace-label-selector flag.
pkg/app/server.go Adds configureNamespaceDiscovery and wires namespace discovery + rebuild loop into startup.
pkg/app/server_test.go Adds tests for configureNamespaceDiscovery behavior.
pkg/builder/types/interfaces.go Refactors builder interface to embed a NamespaceContainer capability.
internal/store/builder.go Switches to a single NamespaceAll ListWatch per resource and applies namespace filtering via store opts.
pkg/metrics_store/metrics_store.go Adds predicate support and a namespace predicate option to filter events at store level.
internal/discovery/ns_discovery.go Adds a namespace informer-based discoverer and a polling API for namespace-set changes.
internal/discovery/ns_discovery_test.go Adds tests for namespace discoverer behavior and concurrency.
internal/discovery/cr_types.go Adds CR discoverer types used by CRD discovery.
internal/discovery/cr_discovery.go Adds CRD discovery + cache polling logic (used by server startup).
internal/discovery/cr_discovery_test.go Adds tests for CR discoverer GVK resolution logic.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/metrics_store/metrics_store.go
Comment thread internal/store/builder.go Outdated
Comment thread pkg/app/server.go Outdated
Comment thread pkg/app/server.go Outdated
Comment thread pkg/app/server_test.go
Comment thread internal/discovery/ns_discovery_test.go
Comment thread pkg/app/server.go
Comment thread internal/store/builder.go
Comment thread pkg/app/server.go
Comment thread internal/discovery/ns_discovery.go
@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla bot commented Apr 15, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 15, 2026
@Serializator Serializator force-pushed the ISSUE-2820-dynamic-namespaces branch from 0fb5f3c to 2b2c266 Compare April 15, 2026 20:31
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 15, 2026
@Serializator
Copy link
Copy Markdown
Author

@bhope I rebased. Unfortunately internal/discovery/*_test.go are running indefinitely for some reason. They were running fine before. I pushed so at least the code can be reviewed with the latest changes from main incorporated.

@Serializator Serializator force-pushed the ISSUE-2820-dynamic-namespaces branch from 2b2c266 to bec086c Compare April 16, 2026 22:42
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Apr 16, 2026
Wrap ListWatch with ToListWatcherWithWatchListSemantics to allow
the reflector to detect that fake clients don't support WatchList
semantics. This fixes test hangs introduced with client-go v0.35+.

Related: commit b871c40 fixed the same issue for the store builder.
@Serializator Serializator force-pushed the ISSUE-2820-dynamic-namespaces branch from bec086c to 709c089 Compare April 16, 2026 22:45
Copy link
Copy Markdown
Member

@bhope bhope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR draft and chalking out the core concept. The NamespaceDiscoverer and single reflector approach for dynamic discovery seem reasonable to me and the test coverage is good too.

However, the refactoring that is done for the buildstores (I get the reasoning is to simplify for cluster wide scope) is creating a security regression for multi-tenant clusters. It would require ClusterRoleBinding for all users, breaking the existing pattern where KSM can be scoped to specific namespaces via RoleBindings.

I would suggest to keep the original per-namespace reflectors intact while adding your cluster_wide reflectors and the new code to support namespace-label-selector set separately. This keeps the RBAC footprint unchanged for existing users while enabling the new dynamic discovery feature for those who opt in.

Also, once you have made the changes, try building locally to fix any compile errors (I believe there are some at the moment).

@github-project-automation github-project-automation bot moved this from Needs Triage to In Progress in SIG Instrumentation Apr 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Dynamic namespace filtering

4 participants