feat: Introduce object limits #2626
Conversation
This issue is currently awaiting triage. If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the `triage/accepted` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
```go
	}
	i.metrics.ListRequestsTotal.WithLabelValues("success", i.resource).Inc()

	if i.limit != 0 {
```
We'd want to validate against negative values, also s/int/uint preferably. I'm reviewing this on my way back from work so apologies if I missed something obvious here.
Good point. To be honest, I have no idea why the Kubernetes API made limits a signed int: https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#ListOptions
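For reference, a minimal runnable illustration of that upstream type (the value 500 is arbitrary): `Limit` in `metav1.ListOptions` is an `int64`, which is why negative values have to be rejected explicitly before they ever reach a LIST call.

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// ListOptions.Limit is a signed int64 upstream, so the type system does
	// not prevent negative values; flag validation has to catch them instead.
	opts := metav1.ListOptions{Limit: 500}
	fmt.Printf("LIST capped at %d objects per page\n", opts.Limit)
}
```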
rexagod left a comment:
Can we have an e2e test-case as well that covers --object-limit? Also one nit, otherwise LGTM.
```go
	}

	if o.ObjectLimit < 0 {
		return fmt.Errorf("value for --object-limit=%d must be equal or greater than 0", o.ObjectLimit)
```
Do we want this to be strictly greater than zero? I see a couple of places in watch.go where we check for this being greater than zero. Or maybe we want the conditions to reflect the zero value as well, and accommodate that in watch.go too?
The latter would make more sense, since we'd keep the same expectations that cache.ListerWatcher allows upstream.
It throws an error because we assume that a negative object limit is invalid. If it's zero, we don't want to throw an error because this is the default and means that it's not set.
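A small sketch of those semantics (`applyObjectLimit` is a hypothetical helper, not code from this PR): zero is treated as "not set" and leaves the LIST unbounded, and only positive values are forwarded to the apiserver.

```go
package app

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// applyObjectLimit mirrors the behavior described above: a zero limit means
// "not set" and leaves the LIST unbounded; negative values never get this far
// because flag validation rejects them.
func applyObjectLimit(opts *metav1.ListOptions, objectLimit int64) {
	if objectLimit > 0 {
		opts.Limit = objectLimit
	}
}
```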
This change allows user-controlled limits on how many objects KSM will list from the API. This helps prevent resource exhaustion on KSM in case the API holds too many objects. The object limit is set globally and applied per watched resource.
@rexagod I added another e2e test, please take a look if you have some spare cycles.
dgrisonnet left a comment:
With sharding the behavior will be a bit different, as we filter some of the objects after the LIST: https://github.com/kubernetes/kube-state-metrics/blob/main/pkg/sharding/listwatch.go#L63-L71
But that should be fine, as the intent here is to reduce the pressure on the apiserver and we can't apply the sharding filter during the LIST call.
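A rough sketch of that ordering (illustrative names only, not the code in pkg/sharding/listwatch.go): the shard filter only runs over objects the LIST has already returned, so only the object limit reduces how much the apiserver hands back in the first place.

```go
package sharding

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// filterOwned is an illustrative stand-in for the shard filter: it drops
// objects client-side after the (possibly limited) LIST has completed.
func filterOwned(objs []metav1.PartialObjectMetadata, owned func(*metav1.PartialObjectMetadata) bool) []metav1.PartialObjectMetadata {
	kept := make([]metav1.PartialObjectMetadata, 0, len(objs))
	for i := range objs {
		if owned(&objs[i]) {
			kept = append(kept, objs[i])
		}
	}
	return kept
}
```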
```diff
	expectedType interface{},
	listWatchFunc func(customResourceClient interface{}, ns string, fieldSelector string) cache.ListerWatcher,
-	useAPIServerCache bool,
+	useAPIServerCache bool, objectLimit int64,
```
+1. I don't see any issues with supporting this in RSM, tracking this here: rexagod/resource-state-metrics#4.
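For readers following the signature change above, a minimal sketch of how an objectLimit parameter can be threaded into a per-resource ListerWatcher (hypothetical helper; a typed Pods client is used only for illustration, not KSM's actual plumbing):

```go
package app

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// limitedPodListWatch shows the general shape: the limit only affects the
// LIST path, while the WATCH path is left untouched.
func limitedPodListWatch(client kubernetes.Interface, ns string, objectLimit int64) cache.ListerWatcher {
	return &cache.ListWatch{
		ListFunc: func(opts metav1.ListOptions) (runtime.Object, error) {
			if objectLimit > 0 {
				opts.Limit = objectLimit
			}
			return client.CoreV1().Pods(ns).List(context.TODO(), opts)
		},
		WatchFunc: func(opts metav1.ListOptions) (watch.Interface, error) {
			return client.CoreV1().Pods(ns).Watch(context.TODO(), opts)
		},
	}
}
```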
/lgtm
For others to take a look.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mrueg, rexagod
The full list of commands accepted by this bot can be found here. The pull request process is described here.
/unhold |
What this PR does / why we need it:
This change allows user-controlled limits on how many objects KSM will list from the API. This helps prevent resource exhaustion on KSM in case the API holds too many objects.
The object limit is set globally and applied per watched resource.
This is currently a WIP as I'm not sure if it will work as expected and it needs further testing.
How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)
Introduces a new metric in KSM's telemetry, which is static over the lifetime of the KSM configuration: `kube_state_metrics_list_limit`. This will help with alerting when a threshold is reached.
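To make the telemetry concrete, here is a sketch of exposing such a static gauge with client_golang (not the actual KSM telemetry code; the help text is illustrative). Operators can then compare it against observed list sizes in an alerting rule.

```go
package app

import "github.com/prometheus/client_golang/prometheus"

// registerListLimit publishes the configured object limit as a static gauge,
// roughly how a metric like kube_state_metrics_list_limit could be exposed.
func registerListLimit(reg prometheus.Registerer, objectLimit int64) {
	listLimit := prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "kube_state_metrics_list_limit",
		Help: "Configured limit on the number of objects listed per resource.",
	})
	reg.MustRegister(listLimit)
	listLimit.Set(float64(objectLimit))
}
```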
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #2622