Custom Resource State Metrics

This section describes how to add metrics based on the state of a custom resource without writing a custom resource registry and running your own build of KSM.

Configuration

A YAML configuration file described below is required to define your custom resources and the fields to turn into metrics.

Two flags can be used:

  • --custom-resource-state-config "inline yaml (see example)" or
  • --custom-resource-state-config-file /path/to/config.yaml

When using a --config file, the equivalent YAML key is custom_resource_state_config_file. The pre-v2.17 key custom_resource_config_file is still honored as a deprecated alias and will be removed in a future release; please migrate to custom_resource_state_config_file.

If both flags are provided, the inline configuration takes precedence. When multiple entries for the same resource exist, kube-state-metrics exits with an error. This also applies to entries that refer to the same resource under a different API version.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: kube-state-metrics
        args:
          - --custom-resource-state-config
          # in YAML files, | allows a multi-line string to be passed as a flag value
          # see https://yaml-multiline.info
          -  |
              kind: CustomResourceStateMetrics
              spec:
                resources:
                  - groupVersionKind:
                      group: myteam.io
                      version: "v1"
                      kind: Foo
                    metrics:
                      - name: active_count
                        help: "Count of active Foo"
                        each:
                          type: Gauge
                          ...
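
The file-based flag can be wired up in a similar way. The following is a minimal sketch, assuming the configuration is stored in a ConfigMap and mounted into the container. The ConfigMap name ksm-custom-resource-state, the volume name, and the mount path /etc/customresourcestate are hypothetical and chosen only for illustration. The Deployment is abbreviated in the same way as the example above.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ksm-custom-resource-state   # hypothetical name
  namespace: kube-system
data:
  config.yaml: |
    kind: CustomResourceStateMetrics
    spec:
      resources:
        - groupVersionKind:
            group: myteam.io
            version: "v1"
            kind: Foo
          metrics:
            - name: uptime
              help: "Foo uptime"
              each:
                type: Gauge
                gauge:
                  path: [status, uptime]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: kube-state-metrics
        args:
          # points at the file mounted from the ConfigMap below
          - --custom-resource-state-config-file=/etc/customresourcestate/config.yaml
        volumeMounts:
          - name: custom-resource-state   # hypothetical volume name
            mountPath: /etc/customresourcestate
            readOnly: true
      volumes:
        - name: custom-resource-state
          configMap:
            name: ksm-custom-resource-state
```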

It is also possible to run kube-state-metrics in custom-resource-only mode. In addition to specifying one of the --custom-resource-state-config* flags, set --custom-resource-state-only to true. With this configuration, kube-state-metrics only takes into account the custom resources configured through --custom-resource-state-config*.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: kube-state-metrics
        args:
          - --custom-resource-state-config
          # in YAML files, | allows a multi-line string to be passed as a flag value
          # see https://yaml-multiline.info
          -  |
              kind: CustomResourceStateMetrics
              spec:
                resources:
                  - groupVersionKind:
                      group: myteam.io
                      version: "v1"
                      kind: Foo
                    metrics:
                      - name: active_count
                        help: "Count of active Foo"
                        each:
                          type: Gauge
                          ...
          - --custom-resource-state-only=true

NOTE: The customresource_group, customresource_version, and customresource_kind common labels are reserved, and will be overwritten by the values from the groupVersionKind field.

RBAC-enabled Clusters

Please be aware that kube-state-metrics needs list and watch permissions granted to customresourcedefinitions.apiextensions.k8s.io as well as to the resources you want to gather metrics from.
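
As an illustration, the permissions described above could be granted with a ClusterRole along the following lines. This is a sketch under assumptions: the role and binding names are hypothetical, "foos" is the assumed plural resource name for the Foo kind used in this document, and the service account is assumed to be kube-state-metrics in the kube-system namespace.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics-custom-resources   # hypothetical name
rules:
  # needed to discover the custom resource definitions
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["list", "watch"]
  # one rule per custom resource to collect; "foos" is the assumed plural
  - apiGroups: ["myteam.io"]
    resources: ["foos"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics-custom-resources   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics-custom-resources
subjects:
  - kind: ServiceAccount
    name: kube-state-metrics   # assumed service account name
    namespace: kube-system
```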

Examples

The examples in this section will use the following custom resource:

kind: Foo
apiVersion: myteam.io/v1
metadata:
    annotations:
        bar: baz
        qux: quxx
    labels:
        foo: bar
    name: foo
spec:
    version: v1.2.3
    order:
        - id: 1
          value: true
        - id: 3
          value: false
    replicas: 1
    refs:
        - my_other_foo
        - foo_2
        - foo_with_extensions
status:
    phase: Pending
    active:
        type-a: 1
        type-b: 3
    conditions:
        - name: a
          value: 45
        - name: b
          value: 66
    sub:
        type-a:
            active: 1
            ready: 2
        type-b:
            active: 3
            ready: 4
    uptime: 43.21

Single Values

The config:

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "uptime"
          help: "Foo uptime"
          each:
            type: Gauge
            gauge:
              path: [status, uptime]

Produces the metric:

kube_customresource_uptime{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1"} 43.21

Multiple Metrics/Kitchen Sink

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      # labels can be added to all metrics from a resource
      commonLabels:
        crd_type: "foo"
      labelsFromPath:
        name: [metadata, name]
      metrics:
        - name: "ready_count"
          help: "Number of Foo Bars ready"
          each:
            type: Gauge
            gauge:
              # targeting an object or array will produce a metric for each element
              # labelsFromPath and value are relative to this path
              path: [status, sub]

              # if path targets an object, the object key will be used as label value
              # This is not supported for StateSet type as all values will be truthy, which is redundant.
              labelFromKey: type
              # label values can be resolved specific to this path 
              labelsFromPath:
                active: [active]
              # The actual field to use as metric value. Should be a number, boolean or RFC3339 timestamp string.
              valueFrom: [ready]
          commonLabels:
            custom_metric: "yes"
          labelsFromPath:
            # whole objects may be copied into labels by prefixing with "*"
            # *anything will be copied into labels, with the highest sorted * strings first
            "*": [metadata, labels]
            # a prefix before the asterisk will be used as a label prefix
            "lorem_*": [metadata, annotations]
            "**": [metadata, annotations]
            
            # or specific fields may be copied. these fields will always override values from *s
            name: [metadata, name]
            foo: [metadata, labels, foo]

Produces the following metrics:

kube_customresource_ready_count{customresource_group="myteam.io", customresource_kind="Foo", 
customresource_version="v1", active="1",custom_metric="yes",foo="bar",name="foo",bar="baz",qux="quxx",type="type-a",
lorem_bar="baz",lorem_qux="quxx",} 2
kube_customresource_ready_count{customresource_group="myteam.io", customresource_kind="Foo", 
customresource_version="v1", active="3",custom_metric="yes",foo="bar",name="foo",bar="baz",qux="quxx",type="type-b",
lorem_bar="baz",lorem_qux="quxx",} 4

CEL Expressions Quick Start

CEL (Common Expression Language) expressions provide a powerful way to transform and compute metrics from your custom resources. Here's a quick example:

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      labelsFromPath:
        name: [metadata, name]
      metrics:
        # Calculate a percentage
        - name: "resource_utilization_percent"
          help: "Resource utilization as a percentage"
          each:
            type: Gauge
            gauge:
              path: [status, resources]
              valueFrom:
                celExpr: "(double(value.used) / double(value.total)) * 100.0"
        
        # Conditional metric
        - name: "is_healthy"
          help: "Whether the resource is healthy"
          each:
            type: Gauge
            gauge:
              path: [status, state]
              valueFrom:
                celExpr: "value == 'Running' ? 1.0 : 0.0"
        
        # Return value with dynamic labels
        - name: "capacity_status"
          help: "Capacity with dynamic status label"
          each:
            type: Gauge
            gauge:
              path: [status, resources]
              valueFrom:
                celExpr: |
                  WithLabels(
                    double(value.used) / double(value.total) * 100.0,
                    {'status': value.used > value.warning ? 'warning' : 'ok'}
                  )

Given this custom resource:

kind: Foo
apiVersion: myteam.io/v1
metadata:
  name: example-foo
status:
  resources:
    used: 75
    total: 100
    warning: 80
  state: "Running"

You get these metrics:

kube_customresource_resource_utilization_percent{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", name="example-foo"} 75.0
kube_customresource_is_healthy{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", name="example-foo"} 1.0
kube_customresource_capacity_status{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", name="example-foo", status="ok"} 75.0

For detailed documentation, examples, and migration guides, see the CEL Expressions section in the Gauge metric type documentation below.

Non-map Arrays

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      labelsFromPath:
        name: [metadata, name]
      metrics:
        - name: "ref_info"
          help: "Reference to other Foo"
          each:
            type: Info
            info:
              # targeting an array will produce a metric for each element
              # labelsFromPath and value are relative to this path
              path: [spec, refs]

              # if path targets a list of values (e.g. strings or numbers, not objects or maps), individual values can
              # be referenced by a label using this syntax
              labelsFromPath:
                ref: []

Produces the following metrics:

kube_customresource_ref_info{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", name="foo",ref="my_other_foo"} 1
kube_customresource_ref_info{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", name="foo",ref="foo_2"} 1
kube_customresource_ref_info{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", name="foo",ref="foo_with_extensions"} 1

Same Metrics with Different Labels

  recommendation:
    containerRecommendations:
    - containerName: consumer
      lowerBound:
        cpu: 100m
        memory: 262144k

For example, a VPA exposes the attributes above. To emit the same metric for both CPU and memory, distinguished only by labels, you can use the config below:

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: autoscaling.k8s.io
        kind: "VerticalPodAutoscaler"
        version: "v1"
      labelsFromPath:
        verticalpodautoscaler: [metadata, name]
        namespace: [metadata, namespace]
        target_api_version: [spec, targetRef, apiVersion]
        target_kind: [spec, targetRef, kind]
        target_name: [spec, targetRef, name]
      metrics:
        # for memory
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
          help: "Minimum memory resources the container can use before the VerticalPodAutoscaler updater evicts it."
          commonLabels:
            unit: "byte"
            resource: "memory"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [lowerBound, memory]
        # for CPU
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
          help: "Minimum cpu resources the container can use before the VerticalPodAutoscaler updater evicts it."
          commonLabels:
            unit: "core"
            resource: "cpu"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [lowerBound, cpu]

Produces the following metrics:

# HELP kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound Minimum memory resources the container can use before the VerticalPodAutoscaler updater evicts it.
# TYPE kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound gauge
kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound{container="consumer",customresource_group="autoscaling.k8s.io",customresource_kind="VerticalPodAutoscaler",customresource_version="v1",namespace="namespace-example",resource="memory",target_api_version="apps/v1",target_kind="Deployment",target_name="target-name-example",unit="byte",verticalpodautoscaler="vpa-example"} 123456
# HELP kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound Minimum cpu resources the container can use before the VerticalPodAutoscaler updater evicts it.
# TYPE kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound gauge
kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound{container="consumer",customresource_group="autoscaling.k8s.io",customresource_kind="VerticalPodAutoscaler",customresource_version="v1",namespace="namespace-example",resource="cpu",target_api_version="apps/v1",target_kind="Deployment",target_name="target-name-example",unit="core",verticalpodautoscaler="vpa-example"} 0.1

VerticalPodAutoscaler

In v2.9.0 the verticalpodautoscalers resource was removed from the list of default resources. To generate metrics for verticalpodautoscalers, you can use the following Custom Resource State config:

# Using --resource=verticalpodautoscalers, we get the following output:
# HELP kube_verticalpodautoscaler_annotations Kubernetes annotations converted to Prometheus labels.
# TYPE kube_verticalpodautoscaler_annotations gauge
# kube_verticalpodautoscaler_annotations{namespace="default",verticalpodautoscaler="hamster-vpa",target_api_version="apps/v1",target_kind="Deployment",target_name="hamster"} 1
# A similar result can be achieved by specifying the following in --custom-resource-state-config:
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: autoscaling.k8s.io
        kind: "VerticalPodAutoscaler"
        version: "v1"
      labelsFromPath:
        verticalpodautoscaler: [metadata, name]
        namespace: [metadata, namespace]
        target_api_version: [apiVersion]
        target_kind: [spec, targetRef, kind]
        target_name: [spec, targetRef, name]
      metrics:
        - name: "annotations"
          help: "Kubernetes annotations converted to Prometheus labels."
          each:
            type: Gauge
            gauge:
              path: [metadata, annotations]
# This will output the following metric:
# HELP kube_customresource_autoscaling_annotations Kubernetes annotations converted to Prometheus labels.
# TYPE kube_customresource_autoscaling_annotations gauge
# kube_customresource_autoscaling_annotations{customresource_group="autoscaling.k8s.io", customresource_kind="VerticalPodAutoscaler", customresource_version="v1", namespace="default",target_api_version="autoscaling.k8s.io/v1",target_kind="Deployment",target_name="hamster",verticalpodautoscaler="hamster-vpa"} 123

The above configuration was tested on this VPA configuration, with an added annotation (foo: 123).

All VerticalPodAutoscaler Metrics

In addition to the above configuration, here is the complete CustomResourceStateMetrics spec to re-enable all of the VPA metrics that were removed from the list of default resources:

VPA CustomResourceStateMetrics
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: autoscaling.k8s.io
        kind: "VerticalPodAutoscaler"
        version: "v1"
      labelsFromPath:
        namespace: [metadata, namespace]
        target_api_version: [spec, targetRef, apiVersion]
        target_kind: [spec, targetRef, kind]
        target_name: [spec, targetRef, name]
        verticalpodautoscaler: [metadata, name]
      metricNamePrefix: "kube"
      metrics:
        # kube_verticalpodautoscaler_annotations
        - name: "verticalpodautoscaler_annotations"
          help: "Kubernetes annotations converted to Prometheus labels."
          each:
            type: Info
            info:
              labelsFromPath:
                annotation_*: [metadata, annotations]
                name: [metadata, name]
        # kube_verticalpodautoscaler_labels
        - name: "verticalpodautoscaler_labels"
          help: "Kubernetes labels converted to Prometheus labels."
          each:
            type: Info
            info:
              labelsFromPath:
                label_*: [metadata, labels]
                name: [metadata, name]
        # kube_verticalpodautoscaler_spec_updatepolicy_updatemode
        - name: "verticalpodautoscaler_spec_updatepolicy_updatemode"
          help: "Update mode of the VerticalPodAutoscaler."
          each:
            type: StateSet
            stateSet:
              labelName: "update_mode"
              path: [spec, updatePolicy, updateMode]
              list: ["Auto", "Initial", "Off", "Recreate"]
        # Memory kube_verticalpodautoscaler_spec_resourcepolicy_container_policies_minallowed_memory
        - name: "verticalpodautoscaler_spec_resourcepolicy_container_policies_minallowed_memory"
          help: "Minimum memory resources the VerticalPodAutoscaler can set for containers matching the name."
          commonLabels:
            unit: "byte"
            resource: "memory"
          each:
            type: Gauge
            gauge:
              path: [spec, resourcePolicy, containerPolicies]
              labelsFromPath:
                container: [containerName]
              valueFrom: [minAllowed, memory]
        # CPU kube_verticalpodautoscaler_spec_resourcepolicy_container_policies_minallowed_cpu
        - name: "verticalpodautoscaler_spec_resourcepolicy_container_policies_minallowed_cpu"
          help: "Minimum cpu resources the VerticalPodAutoscaler can set for containers matching the name."
          commonLabels:
            unit: "core"
            resource: "cpu"
          each:
            type: Gauge
            gauge:
              path: [spec, resourcePolicy, containerPolicies]
              labelsFromPath:
                container: [containerName]
              valueFrom: [minAllowed, cpu]
        # Memory kube_verticalpodautoscaler_spec_resourcepolicy_container_policies_maxallowed_memory
        - name: "verticalpodautoscaler_spec_resourcepolicy_container_policies_maxallowed_memory"
          help: "Maximum memory resources the VerticalPodAutoscaler can set for containers matching the name."
          commonLabels:
            unit: "byte"
            resource: "memory"
          each:
            type: Gauge
            gauge:
              path: [spec, resourcePolicy, containerPolicies]
              labelsFromPath:
                container: [containerName]
              valueFrom: [maxAllowed, memory]
        # CPU kube_verticalpodautoscaler_spec_resourcepolicy_container_policies_maxallowed_cpu
        - name: "verticalpodautoscaler_spec_resourcepolicy_container_policies_maxallowed_cpu"
          help: "Maximum cpu resources the VerticalPodAutoscaler can set for containers matching the name."
          commonLabels:
            unit: "core"
            resource: "cpu"
          each:
            type: Gauge
            gauge:
              path: [spec, resourcePolicy, containerPolicies]
              labelsFromPath:
                container: [containerName]
              valueFrom: [maxAllowed, cpu]
        # Memory kube_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound_memory
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound_memory"
          help: "Minimum memory resources the container can use before the VerticalPodAutoscaler updater evicts it."
          commonLabels:
            unit: "byte"
            resource: "memory"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [lowerBound, memory]
        # CPU kube_verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound_cpu
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound_cpu"
          help: "Minimum cpu resources the container can use before the VerticalPodAutoscaler updater evicts it."
          commonLabels:
            unit: "core"
            resource: "cpu"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [lowerBound, cpu]
        # Memory kube_verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound_memory
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound_memory"
          help: "Maximum memory resources the container can use before the VerticalPodAutoscaler updater evicts it."
          commonLabels:
            unit: "byte"
            resource: "memory"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [upperBound, memory]
        # CPU kube_verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound_cpu
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound_cpu"
          help: "Maximum cpu resources the container can use before the VerticalPodAutoscaler updater evicts it."
          commonLabels:
            unit: "core"
            resource: "cpu"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [upperBound, cpu]
        # Memory kube_verticalpodautoscaler_status_recommendation_containerrecommendations_target_memory
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target_memory"
          help: "Target memory resources the VerticalPodAutoscaler recommends for the container."
          commonLabels:
            unit: "byte"
            resource: "memory"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [target, memory]
        # CPU kube_verticalpodautoscaler_status_recommendation_containerrecommendations_target_cpu
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target_cpu"
          help: "Target cpu resources the VerticalPodAutoscaler recommends for the container."
          commonLabels:
            unit: "core"
            resource: "cpu"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [target, cpu]
        # Memory kube_verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget_memory
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget_memory"
          help: "Target memory resources the VerticalPodAutoscaler recommends for the container ignoring bounds."
          commonLabels:
            unit: "byte"
            resource: "memory"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [uncappedTarget, memory]
        # CPU kube_verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget_cpu
        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget_cpu"
          help: "Target cpu resources the VerticalPodAutoscaler recommends for the container ignoring bounds."
          commonLabels:
            unit: "core"
            resource: "cpu"
          each:
            type: Gauge
            gauge:
              path: [status, recommendation, containerRecommendations]
              labelsFromPath:
                container: [containerName]
              valueFrom: [uncappedTarget, cpu]

Metric types

The configuration supports three kinds of metrics from the OpenMetrics specification.

The metric type is specified by the type field, and its type-specific configuration goes in the matching struct (gauge, info, or stateSet).
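
To make the pairing of type and type-specific struct concrete, here is a sketch that defines one metric of each kind for the example Foo resource. The metric names version_info and status_phase are made up for this illustration, and the StateSet list is hypothetical apart from Pending, which is the only phase shown in the example resource.

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "uptime"
          help: "Foo uptime"
          each:
            type: Gauge
            gauge:                   # settings for type: Gauge
              path: [status, uptime]
        - name: "version_info"       # hypothetical metric name
          help: "Foo version"
          each:
            type: Info
            info:                    # settings for type: Info
              labelsFromPath:
                version: [spec, version]
        - name: "status_phase"       # hypothetical metric name
          help: "Foo status phase"
          each:
            type: StateSet
            stateSet:                # settings for type: StateSet
              labelName: phase
              path: [status, phase]
              list: [Pending, Running, Failed]   # hypothetical states
```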

Gauge

Gauges are current measurements, such as bytes of memory currently used or the number of items in a queue. For gauges the absolute value is what is of interest to a user. [0]

Example:

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "uptime"
          help: "Foo uptime"
          each:
            type: Gauge
            gauge:
              path: [status, uptime]

Produces the metric:

kube_customresource_uptime{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1"} 43.21

Type conversion and special handling

Gauges produce values of type float64, but custom resource fields can be of many types. kube-state-metrics performs implicit type conversions for many of them. Supported types are:

  • (u)int32/64, int, float32 and byte are cast to float64
  • nil is generally mapped to 0.0 if NilIsZero is true, otherwise it will throw an error
  • for bool true is mapped to 1.0 and false is mapped to 0.0
  • for string the following logic applies
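
As an illustration of the boolean rule, here is a sketch against the example Foo resource from the Examples section, whose spec.order list holds boolean value fields. The metric name order_value is made up for this example.

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "order_value"   # hypothetical metric name
          help: "Whether each Foo order entry is set"
          each:
            type: Gauge
            gauge:
              # one metric per element of the spec.order array
              path: [spec, order]
              labelsFromPath:
                id: [id]
              # boolean field: true becomes 1.0, false becomes 0.0
              valueFrom: [value]
```

With the example resource, the entry with value: true should be exported as 1 and the entry with value: false as 0, each carrying its own id label.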

CEL Expressions

CEL (Common Expression Language) expressions offer a powerful alternative to path-based value extraction, enabling transformations, calculations, iterations over collections, and dynamic label generation. CEL is also used in various other parts of the Kubernetes ecosystem.

Basic Syntax:

Use celExpr within the valueFrom object. The value variable contains the data at the specified path:

gauge:
  path: [status, replicas]
  valueFrom:
    celExpr: "double(value) * 2.0"

Simple Transformations:

gauge:
  path: [status, count]
  valueFrom:
    celExpr: "double(value) * 2.0"

gauge:
  path: [status, state]
  valueFrom:
    celExpr: "value == 'active' ? 1.0 : 0.0"

gauge:
  path: [status, resources]
  valueFrom:
    celExpr: "(double(value.used) / double(value.total)) * 100.0"

gauge:
  path: [spec, config]
  valueFrom:
    celExpr: "has(value.timeout) ? double(value.timeout) : 30.0"

gauge:
  path: [metadata]
  labelsFromPath:
    name: ["name"]
    namespace: ["namespace"]
  valueFrom:
    celExpr: "1.0"

The WithLabels Function:

Return both a metric value and additional labels:

WithLabels(metricValue, labelMap)

Examples:

gauge:
  path: [status, uptime]
  valueFrom:
    celExpr: "WithLabels(double(value), {'unit': 'seconds'})"

gauge:
  path: [status, resources]
  valueFrom:
    celExpr: |
      WithLabels(
        double(value.used) / double(value.total) * 100.0,
        {'threshold': value.used > value.warning ? 'high' : 'normal'}
      )

gauge:
  path: [spec, replicas]
  valueFrom:
    celExpr: "WithLabels(double(value), {'scaled': value > 1 ? 'yes' : 'no'})"

Label value stringification: Label map values may be of any type. Non-string values are automatically stringified (via fmt.Sprintf("%v")). If you need precise control over the string output (e.g. fixed decimals, padding, hex), use the format() function from CEL's Strings extension.
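
A sketch contrasting the two options, using status.uptime (43.21) from the example Foo resource. The label names raw and rounded are hypothetical. The format() call follows the Strings extension convention of passing its arguments as a list.

```yaml
gauge:
  path: [status, uptime]
  valueFrom:
    # 'raw' relies on automatic stringification of the number;
    # 'rounded' uses format() to fix the number of decimals
    celExpr: |
      WithLabels(
        double(value),
        {'raw': value, 'rounded': '%.1f'.format([value])}
      )
```

Under these assumptions, the rounded label should carry a value with exactly one decimal place, while the raw label reflects whatever the default stringification yields.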

Label Precedence:

When labels come from multiple sources, they're merged with this precedence (highest to lowest):

  1. AdditionalLabels from WithLabels()
  2. labelsFromPath
  3. commonLabels
  4. Standard labels: customresource_group, customresource_version, customresource_kind
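
The precedence order can be illustrated with a single label set by three sources at once. This is a sketch against the example Foo resource. The label name origin is hypothetical and chosen only to show which source wins.

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "uptime"
          help: "Foo uptime"
          commonLabels:
            origin: "common"      # lowest of the three sources used here
          each:
            type: Gauge
            gauge:
              path: [status]
              labelsFromPath:
                origin: [phase]   # overrides commonLabels
              valueFrom:
                # the label from WithLabels overrides both of the above
                celExpr: "WithLabels(double(value.uptime), {'origin': 'cel'})"
```

Following the list above, the resulting metric should carry origin="cel". Without the WithLabels label, the value would come from labelsFromPath (the phase of the resource), and without that, from commonLabels.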

Migrating from Path-Based Extraction:

Iterating over objects:

# Path-based
gauge:
  path: [status, active]
  labelFromKey: "type"

# CEL
gauge:
  path: [status, active]
  valueFrom:
    celExpr: "value.map(k, WithLabels(value[k], {'type': k}))"

Iterating over arrays with label extraction:

# Path-based
gauge:
  path: [status, conditions]
  labelsFromPath:
    type: ["type"]
  valueFrom: ["status"]

# CEL
gauge:
  path: [status, conditions]
  valueFrom:
    celExpr: "value.map(c, WithLabels(c.status, {'type': c.type}))"

Deep object navigation:

# Path-based
gauge:
  path: [status, sub]
  labelsFromPath:
    active: ["active"]
  valueFrom: ["ready"]
  labelFromKey: "type"

# CEL
gauge:
  path: [status, sub]
  valueFrom:
    celExpr: "value.map(k, WithLabels(value[k].ready, {'type': k, 'active': value[k].active}))"

Accessing nested fields:

# Path-based
gauge:
  path: [metadata]
  valueFrom: ["creationTimestamp"]

# CEL (dot notation)
gauge:
  path: [metadata]
  valueFrom:
    celExpr: "value.creationTimestamp"

Complex Data Structures:

gauge:
  path: [status, services]
  valueFrom:
    celExpr: "value.map(name, WithLabels(value[name].requestCount, {'service': name}))"

gauge:
  path: [status, pods]
  valueFrom:
    celExpr: "double(value.filter(p, p.ready).size())"

gauge:
  path: [status, replicas]
  valueFrom:
    celExpr: "value.map(r, WithLabels(r.count, {'zone': r.zone, 'ready': string(r.ready)}))"

Combining CEL with labelsFromPath:

Labels from both sources are merged (with CEL's WithLabels taking precedence):

gauge:
  path: [metadata]
  labelsFromPath:
    name: ["name"]
    namespace: ["namespace"]
  valueFrom:
    celExpr: "WithLabels(1.0, {'source': 'cel', 'name': 'override'})"

Result: name="override" (from CEL), namespace="..." (from labelsFromPath), source="cel" (from CEL).

Important: When CEL returns multiple values (e.g., .map()), labelsFromPath evaluates against the original value at path, not each returned element. Use WithLabels for per-element labels:

gauge:
  path: [status, conditions]
  labelsFromPath:
    name: ["name"]
  valueFrom:
    celExpr: "value.map(c, WithLabels(c.status, {'type': c.type}))"

Allowed Return Types:

CEL expressions can return values that will be converted to float64 metric values using the same type conversion rules described in Type conversion and special handling. Supported return types:

  • Single value: Any value convertible to float64 (e.g., 42, true, "123", "2024-01-01T00:00:00Z", "250m", "50%") - produces a single metric
  • List of values: An array/list of values convertible to float64 (e.g., [1, 2, 3], ["true", "false"]) - produces multiple metrics, one for each element
  • WithLabels result: The result of calling WithLabels(metricValue, labelMap) where metricValue is any value convertible to float64 - produces a single metric with additional labels
  • List of WithLabels results: An array/list of WithLabels results (e.g., value.map(k, WithLabels(value[k], {'key': k}))) - produces multiple metrics, each with their own labels

Values are converted using the same logic as path-based extraction: booleans, integers, strings (including dates, quantities, percentages), and other types are automatically converted to float64. See the type conversion section above for details.
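
As a rough illustration of the four return shapes listed above, the fragment below defines one metric per shape. The metric names and the status fields (readyReplicas, phase, containers, restartCount) are hypothetical; only the overall structure follows the configuration format used in this document:

```yaml
metrics:
  - name: ready_replicas            # single value: one series
    help: "Single value"
    each:
      type: Gauge
      gauge:
        path: [status]
        valueFrom:
          celExpr: "double(value.readyReplicas)"
  - name: restart_counts            # list of values: one series per element
    help: "List of values"
    each:
      type: Gauge
      gauge:
        path: [status, containers]
        valueFrom:
          celExpr: "value.map(c, c.restartCount)"
  - name: ready_replicas_by_phase   # WithLabels result: one series with an extra label
    help: "WithLabels result"
    each:
      type: Gauge
      gauge:
        path: [status]
        valueFrom:
          celExpr: "WithLabels(value.readyReplicas, {'phase': value.phase})"
  - name: restarts_by_container     # list of WithLabels results: one labeled series per element
    help: "List of WithLabels results"
    each:
      type: Gauge
      gauge:
        path: [status, containers]
        valueFrom:
          celExpr: "value.map(c, WithLabels(c.restartCount, {'container': c.name}))"
```

Note that a plain list of values carries no per-element labels, so the WithLabels form is usually the better choice when the elements need to be told apart.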

Available CEL Functions:

CEL provides many built-in functions, macros, and operators:

  • Type conversions: double(), int(), string(), bool()
  • String operations: size(), contains(), startsWith(), endsWith()
  • Collections: map(), filter(), all(), exists(), size()
  • Conditionals: condition ? trueValue : falseValue
  • Field checks: has(value.field)
  • Arithmetic: +, -, *, /, %
  • Comparisons: ==, !=, <, >, <=, >=

See the CEL language definition for complete documentation.
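
The sketch below combines a few of the built-ins listed above. The metric names and the fields replicas, type, and status are assumptions about the shape of the resource, not something kube-state-metrics requires:

```yaml
metrics:
  - name: replicas_or_zero          # hypothetical metric name
    help: "Replica count, or 0 when the field is absent"
    each:
      type: Gauge
      gauge:
        path: [status]
        valueFrom:
          # has() guards the optional field; the conditional supplies a fallback.
          celExpr: "has(value.replicas) ? double(value.replicas) : 0.0"
  - name: is_ready                  # hypothetical metric name
    help: "1 when a Ready condition with status True exists"
    each:
      type: Gauge
      gauge:
        path: [status, conditions]
        valueFrom:
          # exists() scans the list; the comparison operators build the predicate.
          celExpr: "value.exists(c, c.type == 'Ready' && c.status == 'True') ? 1.0 : 0.0"
```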

Extensions:

In addition to the built-in macros and functions, kube-state-metrics enables several official cel-go extension libraries. The set of functions available to a CEL expression is fixed by the version of each extension that kube-state-metrics is built with; the table below lists the versions currently in use.

| Extension | Version | Documentation |
|---|---|---|
| Strings | 4 | ext/README.md#strings, cel-spec strings.md |
| Lists | 3 | ext/README.md#lists |
| Sets | 1 | ext/README.md#sets |
| Math | 2 | ext/README.md#math |
| Bindings | 1 | ext/README.md#bindings |
| TwoVarComprehensions | 1 | ext/README.md#twovarcomprehensions |

If you need a function that was added in a later version of an extension, bump the corresponding *Version(N) argument in newCELValueExtractor inside cel_value_extractor.go.

Examples:

The examples below assume value is the result of resolving the path defined alongside celExpr. Each example shows a sample resolved value, the celExpr you might configure, and the metric(s) it would produce. Standard custom-resource identity labels (customresource_group, customresource_kind, customresource_version, namespace, name, etc.) are always attached by kube-state-metrics and are omitted below for brevity.

Strings

Use when a string field needs to be parsed or normalized before being emitted as a label.

# value: {"name": "app-frontend-prod", "replicas": 3}
celExpr: "WithLabels(value.replicas, {'component': value.name.split('-')[1]})"

kube_customresource_app_replicas{component="frontend"} 3.0

# value: {"image": "nginx:1.25", "count": 5}
celExpr: "WithLabels(value.count, {'image': value.image.replace(':', '@')})"

kube_customresource_image_info{image="nginx@1.25"} 5.0

Lists

Use for aggregating or reordering list-shaped CR fields.

# value:
#   [{"type":"Ready","status":"True","lastTransitionTime":"2024-01-02T00:00:00Z"},
#    {"type":"Available","status":"True","lastTransitionTime":"2024-01-03T00:00:00Z"}]
celExpr: "value.sortBy(c, c.lastTransitionTime).reverse()[0].status"

kube_customresource_latest_condition_ready 1.0

# value: {"partitions": [[1, 2, 3], [4, 5]]}
celExpr: "value.partitions.flatten().size()"

kube_customresource_partition_entry_count 5.0

Sets

Use to gate or flag metrics based on set membership.

# value: {"type": "Pod", "allowedTypes": ["Pod", "Job"], "count": 5}
celExpr: "sets.contains(value.allowedTypes, [value.type]) ? double(value.count) : 0.0"

kube_customresource_allowed_resource_count 5.0

# value: {"conditions": [{"type":"Ready"}, {"type":"Initialized"}]}
celExpr: "sets.intersects(['Ready', 'Available'], value.conditions.map(c, c.type))"

kube_customresource_availability_signal 1.0

Math

Use for numeric transforms that plain arithmetic cannot express.

# value: {"replicas": -1}
celExpr: "math.greatest(double(value.replicas), 0.0)"

kube_customresource_effective_replicas 0.0

# value: {"used": 7, "total": 9}
celExpr: "math.round((double(value.used) / double(value.total)) * 100.0)"

kube_customresource_utilization_percent 78.0

Bindings

Use cel.bind(name, expr, body) to evaluate expr once and reference it by name inside body. Most useful when the same sub-expression feeds both the metric value and a derived label.

# value: {"used": 90, "total": 100}
celExpr: |
  cel.bind(ratio,
    double(value.used) / double(value.total),
    WithLabels(ratio, {'alert': ratio > 0.8 ? 'high' : 'ok'}))

kube_customresource_usage_ratio{alert="high"} 0.9

# value: {"used": 10, "total": 100}
celExpr: |
  cel.bind(ratio,
    double(value.used) / double(value.total),
    ratio > 0.8 ? ratio : 0.0)

kube_customresource_usage_ratio_if_high 0.0

TwoVarComprehensions

Adds transformList, transformMap, and two-variable forms of all/exists that expose both the index/key and the value. Useful when the index or key needs to appear as a label.

# value: [{"name":"app","restartCount":1}, {"name":"sidecar","restartCount":3}]
celExpr: |
  value.transformList(i, c,
    WithLabels(c.restartCount, {'index': string(i), 'name': c.name}))

kube_customresource_container_restarts{index="0", name="app"} 1.0
kube_customresource_container_restarts{index="1", name="sidecar"} 3.0

# value: [{"ready":true}, {"ready":true}]
celExpr: "value.all(i, r, r.ready)"

kube_customresource_all_replicas_ready 1.0

Backwards Compatibility:

Path-based valueFrom is still fully supported, both as a bare path list and under the explicit pathValueFrom key:

gauge:
  path: [status, conditions]
  valueFrom: [status]

gauge:
  path: [status, conditions]
  valueFrom:
    pathValueFrom: [status]

Note: You cannot use both celExpr and pathValueFrom in the same valueFrom configuration.
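
The restriction in the note can be sketched as follows. The commented-out fragment is a counter-example of a configuration that sets both keys; how kube-state-metrics reports such a configuration is not described here, so treat it purely as an illustration of what to avoid:

```yaml
# Not allowed: celExpr and pathValueFrom in the same valueFrom.
# gauge:
#   path: [status, conditions]
#   valueFrom:
#     pathValueFrom: [status]
#     celExpr: "value.map(c, WithLabels(c.status, {'type': c.type}))"

# Allowed: exactly one of the two keys.
gauge:
  path: [status, conditions]
  valueFrom:
    celExpr: "value.map(c, WithLabels(c.status, {'type': c.type}))"
```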


Example for status conditions on Kubernetes Controllers

This example demonstrates using Gauge metrics to track Kubernetes controller status conditions:

kind: CustomResourceStateMetrics
spec:
  resources:
  - groupVersionKind:
      group: myteam.io
      kind: "Foo"
      version: "v1"
    labelsFromPath:
      name:
      - metadata
      - name
      namespace:
      - metadata
      - namespace
    metrics:
    - name: "foo_status"
      help: "status condition"
      each:
        type: Gauge
        gauge:
          path: [status, conditions]
          labelsFromPath:
            type: ["type"]
          valueFrom: ["status"]

This works for Kubernetes controller CRs that expose status conditions according to the Kubernetes API (https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#Condition):

status:
  conditions:
    - lastTransitionTime: "2019-10-22T16:29:31Z"
      status: "True"
      type: Ready

kube_customresource_foo_status{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", type="Ready"} 1.0

StateSet

StateSets represent a series of related boolean values, also called a bitset. If ENUMs need to be encoded this MAY be done via StateSet. [1]

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "status_phase"
          help: "Foo status_phase"
          each:
            type: StateSet
            stateSet:
              labelName: phase
              path: [status, phase]
              list: [Pending, Bar, Baz]

Metrics of type StateSet generate one series per entry in list for each resource. A series has the value 1 when the value found at path equals that entry, and 0 otherwise.

For a Foo whose status.phase is Pending, this produces:

kube_customresource_status_phase{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", phase="Pending"} 1
kube_customresource_status_phase{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", phase="Bar"} 0
kube_customresource_status_phase{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", phase="Baz"} 0
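
The series above correspond to an object whose status.phase is Pending. A hypothetical Foo of that shape might look like the following; the object name and namespace are made up for illustration:

```yaml
apiVersion: myteam.io/v1
kind: Foo
metadata:
  name: example-foo      # hypothetical
  namespace: default     # hypothetical
status:
  phase: Pending         # matches the first entry of list, so only that series is 1
```

If status.phase later changed to Bar, the phase="Bar" series would be expected to become 1 and the phase="Pending" series 0.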

Info

Info metrics are used to expose textual information which SHOULD NOT change during process lifetime. Common examples are an application's version, revision control commit, and the version of a compiler. [2]

Metrics of type Info will always have a value of 1.

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metrics:
        - name: "version"
          help: "Foo version"
          each:
            type: Info
            info:
              labelsFromPath:
                version: [spec, version]

Produces the metric:

kube_customresource_version{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1", version="v1.2.3"} 1

Naming

The default metric names are prefixed to avoid collisions with other metrics. By default, the prefix kube_customresource is used, and the resource's group, version, and kind are attached as the customresource_group, customresource_version, and customresource_kind labels. You can override the prefix with the metricNamePrefix field.

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind: ...
      metricNamePrefix: myteam_foos
      metrics:
        - name: uptime
          # ...

Produces:

myteam_foos_uptime{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1"} 43.21

To omit the prefix altogether, set metricNamePrefix to the empty string:

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind: ...
      metricNamePrefix: ""
      metrics:
        - name: uptime
          # ...

Produces:

uptime{customresource_group="myteam.io", customresource_kind="Foo", customresource_version="v1"} 43.21
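
One situation where a per-resource prefix helps is when two resources define a metric with the same name. The sketch below assumes a second kind Bar in the same group and a status.uptime field on both kinds; both are hypothetical:

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: myteam.io
        kind: "Foo"
        version: "v1"
      metricNamePrefix: myteam_foos
      metrics:
        - name: uptime
          help: "Foo uptime"
          each:
            type: Gauge
            gauge:
              path: [status, uptime]   # assumed field
    - groupVersionKind:
        group: myteam.io
        kind: "Bar"                    # hypothetical second kind
        version: "v1"
      metricNamePrefix: myteam_bars
      metrics:
        - name: uptime
          help: "Bar uptime"
          each:
            type: Gauge
            gauge:
              path: [status, uptime]   # assumed field
```

Following the naming pattern shown above, this would be expected to expose myteam_foos_uptime and myteam_bars_uptime as separate metric families.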

Logging

If a metric path is registered but not found on a custom resource, an error will be logged. For some resources, this may produce a lot of noise. The error log verbosity for a metric or resource can be set with errorLogV on the resource or metric:

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind: ...
      errorLogV: 0  # 0 = default for errors
      metrics:
        - name: uptime
          errorLogV: 10  # only log at high verbosity

Path Syntax

Paths are specified as a list of strings. Each string is a path segment, resolved dynamically against the data of the custom resource. If any part of a path is missing, the result is nil.

Examples:

# simple path lookup
[spec, replicas]                         # spec.replicas == 1

# indexing an array
[spec, order, "0", value]                # spec.order[0].value = true

# finding an element in a list by key=value  
[status, conditions, "[name=a]", value]  # status.conditions[0].value = 45

# if the value to be matched is a number or boolean, the value is compared as a number or boolean  
[status, conditions, "[value=66]", name]  # status.conditions[1].name = "b"

# For generally matching against a field in an object schema, use the following syntax:
[metadata, "name=foo"] # if v, ok := metadata[name]; ok && v == "foo" { return v; } else { /* ignore */ }
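
The inline comments in the examples above refer to an object that is not shown. A hypothetical custom resource fragment consistent with those comments might look like this; the field names and values are reconstructed from the comments and are assumptions:

```yaml
metadata:
  name: foo              # matched by [metadata, "name=foo"]
spec:
  replicas: 1            # [spec, replicas]
  order:
    - value: true        # [spec, order, "0", value]
status:
  conditions:
    - name: a            # selected by "[name=a]"
      value: 45
    - name: b            # selected by "[value=66]"
      value: 66
```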

Wildcard matching of version and kind fields

The Custom Resource State (CRS from here on) configuration also allows you to monitor all versions and/or kinds that come under a group. It watches the installed CRDs for this purpose. Taking the aforementioned Foo object as reference, the configuration below monitors all objects of all kinds and all versions under the myteam.io group.

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: "myteam.io"
        version: "*" # Set to `v1` to monitor all kinds under `myteam.io/v1`. Wildcard matches all installed versions that come under this group.
        kind: "*" # Set to `Foo` to monitor all `Foo` objects under the `myteam.io` group (under all versions). Wildcard matches all installed kinds that come under this group (and version, if specified).
      metrics:
        - name: "myobject_info"
          help: "Foo Bar Baz"
          each:
            type: Info
            info:
              path: [metadata]
              labelsFromPath:
                object: [name]
                namespace: [namespace]

The configuration above produces these metrics.

kube_customresource_myobject_info{customresource_group="myteam.io",customresource_kind="Foo",customresource_version="v1",namespace="ns",object="foo"} 1
kube_customresource_myobject_info{customresource_group="myteam.io",customresource_kind="Bar",customresource_version="v1",namespace="ns",object="bar"} 1
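
The wildcard can also be applied to only one of the two fields. As a sketch based on the inline comments in the configuration above, the following variant pins kind to Foo while leaving version open; the metric name and help text are hypothetical:

```yaml
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: "myteam.io"
        version: "*"     # all installed versions under myteam.io
        kind: "Foo"      # only Foo objects
      metrics:
        - name: "foo_info"
          help: "Foo objects across all installed versions"
          each:
            type: Info
            info:
              path: [metadata]
              labelsFromPath:
                object: [name]
                namespace: [namespace]
```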

Note

  • When a CRD defines multiple versions of the same kind under a single group, the wildcard resolves to all versions, as expected. A query for any specific version, however, returns all resources under all versions, in that version's representation. For two such versions A and B, a resource that exists under B is therefore also reflected in the metrics generated for A, in addition to A's own resources, and vice versa. This follows from the current listing behavior of the client-go library.
  • The introduction of this feature further discourages (and discontinues) the use of native objects in the CRS featureset, since these do not have an explicit CRD associated with them, and conflict with internal stores defined specifically for such native resources. Please consider opening an issue or raising a PR if you'd like to expand on the current metric labelsets for them. Also, any such configuration will be ignored, and no metrics will be generated for the same.