For the complete documentation index and AI-optimized content, see /llms.txt. All pages support markdown format via .md extension or Accept: text/markdown header.

Autoscaling Checks

Autoscaling Checks are small canary workloads and runner Pods that validate the autoscaling control plane in a real cluster. They check that signal generation, metric delivery, KEDA, HPA, and Deployment replica changes work together. They are not intended to test application business logic.

The checks expose kedify_autoscaling_check_duration_seconds, a gauge that reports the duration, in seconds, of the latest completed check iteration. Alert on it when autoscaling takes longer than your operating threshold.

The Helm chart installs separate runner and target apps for each enabled check. HTTP and CPU are enabled by default. Prometheus and memory are opt-in because they depend on additional cluster setup or take longer to settle.

| Check | Signal source | Validates |
|---|---|---|
| HTTP | In-cluster HTTP requests | Kedify HTTP scaler metric, HPA metric, HPA activation, scale-up/down |
| Prometheus | Work-simulator sample app metric | Source Prometheus metric, KEDA metric, HPA metric, scale-up/down |
| CPU | Load-generator sample app CPU profile | metrics.k8s.io, HPA CPU metric, HPA activation, scale-up/down |
| Memory | Load-generator sample app memory load | metrics.k8s.io, HPA memory metric, HPA activation, scale-up/down |

Each check has its own runner and target. The runner creates the autoscaling signal, validates the metric and HPA path, waits for the target to scale up and back down, and exposes kedify_autoscaling_check_* metrics.
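To watch the scale-up and scale-down phase by hand, a polling loop like the one below can help. This is a minimal sketch and not the runner's implementation: the function name wait_for_replicas and the NAMESPACE and POLL_SECONDS variables are illustrative assumptions, and the elapsed time is approximated from the number of polls.

```shell
#!/bin/sh
# Polls a Deployment until its replica count satisfies "<op> <count>",
# or gives up after a timeout. Prints the approximate elapsed seconds
# on success and returns non-zero on timeout.
# Usage: wait_for_replicas <deployment> <-ge|-le> <count> [timeout-seconds]
NAMESPACE="${NAMESPACE:-autoscaling-checks}"
POLL_SECONDS="${POLL_SECONDS:-5}"

wait_for_replicas() {
  deploy="$1"; op="$2"; want="$3"; timeout="${4:-300}"
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    have="$(kubectl -n "$NAMESPACE" get deploy "$deploy" \
      -o jsonpath='{.status.replicas}' 2>/dev/null)"
    # .status.replicas is omitted when it is zero, so treat empty output as 0.
    if [ "${have:-0}" "$op" "$want" ]; then
      echo "$elapsed"
      return 0
    fi
    sleep "$POLL_SECONDS"
    elapsed=$((elapsed + POLL_SECONDS))
  done
  return 1
}
```

For example, `wait_for_replicas autoscaling-checks-check-http-target -ge 2 300` waits up to five minutes for the HTTP target to reach at least two replicas; the replica counts to wait for depend on your chart values.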

HTTP scaler check:

check-http-runner
-> Kedify HTTP proxy
-> check-http-target
-> KEDA HTTP external metric
-> KEDA-generated HPA
-> Deployment replicas
-> kedify_autoscaling_check_* metrics

Prometheus scaler check:

check-prometheus-runner
-> check-prometheus-target work requests
-> work_simulator_inprogress_tasks
-> Prometheus scrape and query
-> KEDA Prometheus scaler
-> KEDA-generated HPA
-> Deployment replicas
-> kedify_autoscaling_check_* metrics

CPU resource check:

check-cpu-runner
-> check-cpu-target CPU profile
-> metrics.k8s.io CPU samples
-> KEDA CPU scaler
-> KEDA-generated HPA
-> Deployment replicas
-> kedify_autoscaling_check_* metrics

Memory resource check:

check-memory-runner
-> check-memory-target memory profile
-> metrics.k8s.io memory samples
-> KEDA memory scaler
-> KEDA-generated HPA
-> Deployment replicas
-> kedify_autoscaling_check_* metrics

Prerequisites:

  • Kedify/KEDA is installed in the cluster.
  • The Kedify HTTP scaler is installed if the HTTP check is enabled.
  • metrics.k8s.io is available if the CPU or memory check is enabled.
  • Prometheus is reachable from the check runner Pods if the Prometheus check is enabled.
  • Prometheus scrapes the bundled work-simulator target if the Prometheus check is enabled.
  • The Kedify Agent version supports autoscaling-check Service discovery.
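
Two of the conditions above can be spot-checked from a shell before installing. The sketch below assumes the standard KEDA CRD name scaledobjects.keda.sh and the standard APIService name v1beta1.metrics.k8s.io; the function names are illustrative. It does not cover the HTTP scaler, Prometheus reachability, or the Kedify Agent version.

```shell
#!/bin/sh
# Preflight sketch: reports whether KEDA and the resource metrics API
# are visible through the Kubernetes API.

# Succeeds if the KEDA ScaledObject CRD is registered.
has_keda() {
  kubectl get crd scaledobjects.keda.sh >/dev/null 2>&1
}

# Succeeds if the metrics.k8s.io APIService reports Available=True.
has_resource_metrics() {
  status="$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null)"
  [ "$status" = "True" ]
}

# Prints one line per prerequisite and returns non-zero if any is missing.
preflight() {
  rc=0
  if has_keda; then
    echo "ok: KEDA CRDs"
  else
    echo "missing: KEDA CRDs"; rc=1
  fi
  if has_resource_metrics; then
    echo "ok: metrics.k8s.io"
  else
    echo "missing: metrics.k8s.io"; rc=1
  fi
  return "$rc"
}
```

Call `preflight` after defining the functions; a non-zero exit status means at least one of the two checks failed.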

Install the OCI chart from GHCR. If the package is still private, authenticate first:

echo "$GITHUB_TOKEN" | helm registry login ghcr.io --username <github-user> --password-stdin

Install the default HTTP and CPU checks:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
--version <version> \
--namespace autoscaling-checks \
--create-namespace

Enable the memory check when you want to cover memory resource scaling too:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
--version <version> \
--namespace autoscaling-checks \
--create-namespace \
--set memory.enabled=true

If you are testing from a private checkout, use the local chart path and set the image tag explicitly:

helm upgrade --install autoscaling-checks ./charts/autoscaling-checks \
--namespace autoscaling-checks \
--create-namespace \
--set image.repository=ghcr.io/kedify/autoscaling-checks \
--set image.tag=<version>

Enable or disable checks explicitly:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
--version <version> \
--namespace autoscaling-checks \
--create-namespace \
--set prometheus.enabled=true \
--set memory.enabled=true \
--set cpu.enabled=false

The autoscaling-checks namespace is recommended. Other namespaces work as long as the runner Services keep the chart labels and the Kedify Agent can read Services.

Set the Prometheus HTTP API address and the source query for the Prometheus check:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
--version <version> \
--namespace autoscaling-checks \
--create-namespace \
--set prometheus.enabled=true \
--set prometheus.serverAddress=http://kube-prometheus-stack-prometheus.monitoring.svc.cluster.local:9090 \
--set 'prometheus.sourceQuery=sum(work_simulator_inprogress_tasks)'

If you use Prometheus Operator, enable ServiceMonitors for the runner metrics and the Prometheus target sample app:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
--version <version> \
--namespace autoscaling-checks \
--create-namespace \
--set monitoring.serviceMonitor.enabled=true \
--set prometheus.targetServiceMonitor.enabled=true

If your Prometheus discovers scrape targets through Pod annotations, no extra setting is needed: the chart adds scrape annotations by default (monitoring.scrapeAnnotations=true).

The Kedify Agent discovers autoscaling-check runner Services automatically. No extra metrics endpoint setting is required.

The runner Services must have these labels:

app.kubernetes.io/name=autoscaling-checks
app.kubernetes.io/part-of=autoscaling-checks
autoscaling-checks.kedify.io/role=runner

The Helm chart sets these labels and exposes a Service port named metrics. The dashboard shows the results in the cluster detail view under the Autoscaling Checks tab.
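
To confirm what the agent will find, you can list the Services that carry all three labels and check each for a port named metrics. This is a sketch under assumptions: the function name check_runner_services and the NAMESPACE variable are illustrative, and the label selector is simply the three labels listed above.

```shell
#!/bin/sh
# Lists Services carrying the three runner labels and flags any that
# do not expose a port named "metrics".
NAMESPACE="${NAMESPACE:-autoscaling-checks}"
RUNNER_SELECTOR='app.kubernetes.io/name=autoscaling-checks,app.kubernetes.io/part-of=autoscaling-checks,autoscaling-checks.kedify.io/role=runner'

check_runner_services() {
  # Each output line from kubectl is: <service-name> <port-name> [<port-name>...]
  kubectl -n "$NAMESPACE" get svc -l "$RUNNER_SELECTOR" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.ports[*].name}{"\n"}{end}' |
  awk '
    NF == 0 { next }                       # ignore blank lines
    {
      found = 0
      for (i = 2; i <= NF; i++) if ($i == "metrics") found = 1
      status = found ? "ok" : "no-metrics-port"
      print status, $1
    }'
}
```

An empty result from `check_runner_services` means no Service matched the labels, which would also leave the dashboard tab empty.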

Use kedify_autoscaling_check_duration_seconds for the primary alert. It records the latest complete iteration duration and includes bounded labels for the overall result and each step.

Example Prometheus alert:

groups:
  - name: kedify-autoscaling-checks
    rules:
      - alert: KedifyAutoscalingCheckSlow
        expr: kedify_autoscaling_check_duration_seconds > 300
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Kedify autoscaling check is slow
          description: Check {{ $labels.check }} took {{ $value }}s in the latest iteration.

Use the step labels and kedify_autoscaling_check_step_duration_seconds to identify whether the delay is in signal generation, source metrics, KEDA metrics, HPA metrics, HPA activation, scale-up, or scale-down.
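
Without a Prometheus query at hand, the same triage can be approximated on a runner's raw /metrics output. The filter below is a sketch: the function name slow_series is illustrative, and it assumes samples are exposed without a trailing timestamp, so that the value is the last field on the line.

```shell
#!/bin/sh
# Reads Prometheus text exposition format on stdin and prints every
# kedify_autoscaling_check_* duration series whose value exceeds a
# threshold in seconds.
# Usage: slow_series <threshold-seconds>
slow_series() {
  awk -v limit="$1" '
    /^#/ { next }    # skip HELP and TYPE comment lines
    /^kedify_autoscaling_check_(step_)?duration_seconds/ {
      value = $NF    # assumes no timestamp follows the sample value
      if (value + 0 > limit + 0) print
    }'
}
```

With a port-forward to a runner in place, `curl -s localhost:8080/metrics | slow_series 300` prints the overall and per-step series that took longer than five minutes.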

Wait for the runner and target Deployments:

kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-http-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-http-target
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-cpu-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-cpu-target

If optional checks are enabled, wait for those runners and targets too:

kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-prometheus-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-prometheus-target
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-memory-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-memory-target

Inspect one runner directly. The port-forward stays in the foreground, so run the curl commands from a second terminal:

kubectl -n autoscaling-checks port-forward svc/autoscaling-checks-check-http-runner 8080:8080
curl -s localhost:8080/healthz
curl -s localhost:8080/metrics

Check the generated autoscaling resources:

kubectl -n autoscaling-checks get scaledobject,hpa,deploy,svc -l app.kubernetes.io/part-of=autoscaling-checks

If the dashboard tab is empty, verify that the agent can read Services and that the runner Services have the labels shown above and a metrics port.

If the Prometheus check fails, query Prometheus directly with the configured prometheus.sourceQuery and confirm it returns the expected value during a check run.
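
One way to run that query is through the Prometheus HTTP API's instant query endpoint, /api/v1/query. The sketch below assumes Prometheus is reachable at PROM_URL from your shell, for example through a port-forward; PROM_URL and the function name prom_query are illustrative.

```shell
#!/bin/sh
# Runs an instant query against the Prometheus HTTP API.
# PROM_URL must be reachable from where this runs.
PROM_URL="${PROM_URL:-http://localhost:9090}"

prom_query() {
  # --get sends the URL-encoded data as a query string instead of a POST body.
  curl -sS --get "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

For the default source query, run `prom_query 'sum(work_simulator_inprogress_tasks)'`. The response is JSON; an empty data.result array during a check run suggests that Prometheus is not scraping the work-simulator target.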

If the CPU or memory check fails before scaling up, confirm that metrics.k8s.io returns Pod resource metrics in the autoscaling-checks namespace.
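
A quick way to confirm this is to count the Pods for which kubectl top returns a row. This is a sketch; the function name pods_with_metrics and the NAMESPACE variable are illustrative.

```shell
#!/bin/sh
# Prints how many Pods in the namespace currently have resource metrics.
# kubectl top reads from metrics.k8s.io, so 0 means the API returned
# nothing for this namespace or is unavailable.
NAMESPACE="${NAMESPACE:-autoscaling-checks}"

pods_with_metrics() {
  kubectl -n "$NAMESPACE" top pods --no-headers 2>/dev/null |
    awk 'NF >= 3 { n++ } END { print n + 0 }'   # rows are: NAME CPU MEMORY
}
```

If `pods_with_metrics` prints 0 while the target Pods are running, inspect the raw API with `kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/autoscaling-checks/pods` to see the underlying error.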

If scale-down takes too long, check the ScaledObject cooldown values and the chart common.scaleDownTimeout setting.
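
The ScaledObject timing fields can be listed side by side as sketched below. pollingInterval and cooldownPeriod are standard KEDA ScaledObject spec fields; the function name scaledobject_timings and the NAMESPACE variable are illustrative.

```shell
#!/bin/sh
# Shows the polling interval and cooldown period of each ScaledObject
# created for the checks. A value of <none> means the field is unset
# and the KEDA default applies.
NAMESPACE="${NAMESPACE:-autoscaling-checks}"

scaledobject_timings() {
  kubectl -n "$NAMESPACE" get scaledobject \
    -l app.kubernetes.io/part-of=autoscaling-checks \
    -o custom-columns='NAME:.metadata.name,POLLING:.spec.pollingInterval,COOLDOWN:.spec.cooldownPeriod'
}
```

To compare these with the chart setting, `helm get values autoscaling-checks --namespace autoscaling-checks --all` prints the computed values, including common.scaleDownTimeout.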

To remove the checks, uninstall the Helm release:

helm uninstall autoscaling-checks --namespace autoscaling-checks