For the complete documentation index and AI-optimized content, see /llms.txt. All pages support markdown format via .md extension or Accept: text/markdown header.

Autoscaling Checks

Autoscaling Checks are small canary workloads and runner Pods that validate the autoscaling control plane in a real cluster. They check that signal generation, metric delivery, KEDA, HPA, and Deployment replica changes work together. They are not intended to test application business logic.

The checks expose kedify_autoscaling_check_duration_seconds, a gauge that reports the duration, in seconds, of the latest completed check iteration. Alert on it when autoscaling takes longer than your operating threshold.

What Gets Installed

The Helm chart installs separate runner and target apps for each enabled check. HTTP and CPU are enabled by default. Prometheus and memory are opt-in because they depend on additional cluster setup or take longer to settle.

Check	Signal source	Validates
HTTP	In-cluster HTTP requests	Kedify HTTP scaler metric, HPA metric, HPA activation, scale-up/down
Prometheus	Work-simulator sample app metric	Source Prometheus metric, KEDA metric, HPA metric, scale-up/down
CPU	Load-generator sample app CPU profile	`metrics.k8s.io`, HPA CPU metric, HPA activation, scale-up/down
Memory	Load-generator sample app memory load	`metrics.k8s.io`, HPA memory metric, HPA activation, scale-up/down

Architecture

Each check has its own runner and target. The runner creates the autoscaling signal, validates the metric and HPA path, waits for the target to scale up and back down, and exposes kedify_autoscaling_check_* metrics.

HTTP scaler check:

check-http-runner
  -> Kedify HTTP proxy
  -> check-http-target
  -> KEDA HTTP external metric
  -> KEDA-generated HPA
  -> Deployment replicas
  -> kedify_autoscaling_check_* metrics

Prometheus scaler check:

check-prometheus-runner
  -> check-prometheus-target work requests
  -> work_simulator_inprogress_tasks
  -> Prometheus scrape and query
  -> KEDA Prometheus scaler
  -> KEDA-generated HPA
  -> Deployment replicas
  -> kedify_autoscaling_check_* metrics

CPU resource check:

check-cpu-runner
  -> check-cpu-target CPU profile
  -> metrics.k8s.io CPU samples
  -> KEDA CPU scaler
  -> KEDA-generated HPA
  -> Deployment replicas
  -> kedify_autoscaling_check_* metrics

Memory resource check:

check-memory-runner
  -> check-memory-target memory profile
  -> metrics.k8s.io memory samples
  -> KEDA memory scaler
  -> KEDA-generated HPA
  -> Deployment replicas
  -> kedify_autoscaling_check_* metrics

Prerequisites

Kedify/KEDA is installed in the cluster.
The Kedify HTTP scaler is installed if the HTTP check is enabled.
metrics.k8s.io is available if the CPU or memory check is enabled.
Prometheus is reachable from the check runner Pods if the Prometheus check is enabled.
Prometheus scrapes the bundled work-simulator target if the Prometheus check is enabled.
The Kedify Agent version supports autoscaling-check Service discovery.
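
Some of these prerequisites can be checked from the command line before installing. The following is a minimal sketch, not an official preflight tool. The resource names `scaledobjects.keda.sh` and `v1beta1.metrics.k8s.io` are assumptions based on upstream KEDA and metrics-server defaults, and the function names are hypothetical. It does not cover the Kedify HTTP scaler, Prometheus, or the Kedify Agent version.

```shell
# Preflight sketch: report whether the API resources the checks rely on exist.
# Resource names are assumptions (upstream KEDA / metrics-server defaults).

check_resource() {
  # $1 = human-readable label, remaining args = arguments for "kubectl get"
  label=$1
  shift
  if kubectl get "$@" >/dev/null 2>&1; then
    echo "ok: $label"
  else
    echo "missing: $label"
    return 1
  fi
}

preflight() {
  failed=0
  check_resource "KEDA ScaledObject CRD" crd scaledobjects.keda.sh || failed=1
  check_resource "metrics.k8s.io APIService" apiservice v1beta1.metrics.k8s.io || failed=1
  return "$failed"
}
```

Calling `preflight` prints one line per resource and returns a non-zero status if any of them is missing.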

Install

Install the OCI chart from GHCR. If the package is still private, authenticate first:

echo "$GITHUB_TOKEN" | helm registry login ghcr.io --username <github-user> --password-stdin

Install the default HTTP and CPU checks:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
  --version <version> \
  --namespace autoscaling-checks \
  --create-namespace

Enable the memory check when you want to cover memory resource scaling too:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
  --version <version> \
  --namespace autoscaling-checks \
  --create-namespace \
  --set memory.enabled=true

If you are testing from a private checkout, use the local chart path and set the image tag explicitly:

helm upgrade --install autoscaling-checks ./charts/autoscaling-checks \
  --namespace autoscaling-checks \
  --create-namespace \
  --set image.repository=ghcr.io/kedify/autoscaling-checks \
  --set image.tag=<version>

Enable or disable checks explicitly:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
  --version <version> \
  --namespace autoscaling-checks \
  --create-namespace \
  --set prometheus.enabled=true \
  --set memory.enabled=true \
  --set cpu.enabled=false

The autoscaling-checks namespace is recommended. Other namespaces work as long as the runner Services keep the chart labels and the Kedify Agent can read Services.

Configure Prometheus

Set the Prometheus HTTP API address and the source query for the Prometheus check:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
  --version <version> \
  --namespace autoscaling-checks \
  --create-namespace \
  --set prometheus.enabled=true \
  --set prometheus.serverAddress=http://kube-prometheus-stack-prometheus.monitoring.svc.cluster.local:9090 \
  --set 'prometheus.sourceQuery=sum(work_simulator_inprogress_tasks)'

If you use Prometheus Operator, enable ServiceMonitors for the runner metrics and the Prometheus target sample app:

helm upgrade --install autoscaling-checks oci://ghcr.io/kedify/charts/autoscaling-checks \
  --version <version> \
  --namespace autoscaling-checks \
  --create-namespace \
  --set monitoring.serviceMonitor.enabled=true \
  --set prometheus.targetServiceMonitor.enabled=true

If your Prometheus discovers targets through Pod annotations, no extra setting is needed: the chart adds scrape annotations by default (monitoring.scrapeAnnotations=true).
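
If the list of --set flags grows, the same settings can live in a values file. The sketch below assumes the chart accepts the nested form of the --set paths used in this section, which is how Helm normally maps them. The file name and the `write_values` function are placeholders.

```shell
# Sketch: write the --set flags from this section into a values file.
# The keys mirror the --set paths above; the file name is a placeholder.

write_values() {
  # $1 = output path
  cat > "$1" <<'EOF'
prometheus:
  enabled: true
  serverAddress: http://kube-prometheus-stack-prometheus.monitoring.svc.cluster.local:9090
  sourceQuery: "sum(work_simulator_inprogress_tasks)"
  targetServiceMonitor:
    enabled: true
monitoring:
  serviceMonitor:
    enabled: true
EOF
}

# Usage (replace <version> before running):
#   write_values autoscaling-checks-values.yaml
#   helm upgrade --install autoscaling-checks \
#     oci://ghcr.io/kedify/charts/autoscaling-checks \
#     --version <version> \
#     --namespace autoscaling-checks \
#     --create-namespace \
#     --values autoscaling-checks-values.yaml
```

A values file keeps the PromQL expression out of shell quoting and makes the configuration easier to review in version control.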

Dashboard Visibility

The Kedify Agent discovers autoscaling-check runner Services automatically. No extra metrics endpoint setting is required.

The runner Services must have these labels:

app.kubernetes.io/name=autoscaling-checks
app.kubernetes.io/part-of=autoscaling-checks
autoscaling-checks.kedify.io/role=runner

The Helm chart sets these labels and exposes a Service port named metrics. The dashboard shows the results in the cluster detail view under the Autoscaling Checks tab.
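
To see which Services the agent should be able to discover, one option is to query by the three labels above. This is a sketch with hypothetical function names. It assumes discovery is based only on those labels and the named port, as described in this section.

```shell
# Sketch: list Services that carry all three runner labels, with their port names.

runner_selector() {
  # Join the three labels from this section into one kubectl label selector.
  printf '%s,%s,%s\n' \
    'app.kubernetes.io/name=autoscaling-checks' \
    'app.kubernetes.io/part-of=autoscaling-checks' \
    'autoscaling-checks.kedify.io/role=runner'
}

list_runner_services() {
  # $1 = namespace (defaults to autoscaling-checks)
  kubectl -n "${1:-autoscaling-checks}" get svc -l "$(runner_selector)" \
    -o 'custom-columns=NAME:.metadata.name,PORTS:.spec.ports[*].name'
}
```

Each enabled check should appear as one row, and the PORTS column should include metrics.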

Alerting

Use kedify_autoscaling_check_duration_seconds for the primary alert. It records the duration of the latest completed iteration and carries bounded (low-cardinality) labels for the overall result and each step.

Example Prometheus alert:

groups:
  - name: kedify-autoscaling-checks
    rules:
      - alert: KedifyAutoscalingCheckSlow
        expr: kedify_autoscaling_check_duration_seconds > 300
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Kedify autoscaling check is slow
          description: Check {{ $labels.check }} took {{ $value }}s in the latest iteration.

Use the step labels and kedify_autoscaling_check_step_duration_seconds to identify whether the delay is in signal generation, source metrics, KEDA metrics, HPA metrics, HPA activation, scale-up, or scale-down.
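
Before wiring up an alert, the same threshold can be tried against a runner's raw /metrics output. The helper below is a sketch: `slow_checks` is a hypothetical name, and it assumes each sample line ends with the value and has no trailing timestamp.

```shell
# Sketch: print duration samples above a threshold from Prometheus text output.
# Assumes the sample value is the last field on the line (no trailing timestamp).

slow_checks() {
  # $1 = threshold in seconds; metrics text is read from stdin
  awk -v limit="$1" '
    /^kedify_autoscaling_check_duration_seconds[{ ]/ && $NF + 0 > limit + 0 { print }
  '
}
```

With a port-forward to a runner active, `curl -s localhost:8080/metrics | slow_checks 300` would print only the series that the example alert expression would match.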

Verify

Wait for the runner and target Deployments:

kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-http-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-http-target
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-cpu-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-cpu-target

If optional checks are enabled, wait for those runners and targets too:

kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-prometheus-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-prometheus-target
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-memory-runner
kubectl -n autoscaling-checks rollout status deploy/autoscaling-checks-check-memory-target
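
The per-check commands above follow one naming pattern, so they can be wrapped in a loop. This sketch assumes the release is named autoscaling-checks, as in the install commands. The function names and the 120s timeout are arbitrary choices for illustration.

```shell
# Sketch: wait for the runner and target Deployments of the given checks.
# Assumes the Helm release name "autoscaling-checks"; the timeout is arbitrary.

check_deployments() {
  # Print the runner and target Deployment names for each check name given.
  for check in "$@"; do
    echo "deploy/autoscaling-checks-check-$check-runner"
    echo "deploy/autoscaling-checks-check-$check-target"
  done
}

wait_for_checks() {
  # $1 = namespace, remaining args = check names (http, cpu, prometheus, memory)
  ns=$1
  shift
  for deploy in $(check_deployments "$@"); do
    kubectl -n "$ns" rollout status "$deploy" --timeout=120s || return 1
  done
}
```

For the default install, `wait_for_checks autoscaling-checks http cpu` covers the four Deployments. Add prometheus or memory if those checks are enabled.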

Inspect one runner directly. The port-forward command runs in the foreground, so run the curl commands from a second terminal:

kubectl -n autoscaling-checks port-forward svc/autoscaling-checks-check-http-runner 8080:8080
curl -s localhost:8080/healthz
curl -s localhost:8080/metrics

Check the generated autoscaling resources:

kubectl -n autoscaling-checks get scaledobject,hpa,deploy,svc -l app.kubernetes.io/part-of=autoscaling-checks

Troubleshooting

If the dashboard tab is empty, verify that the Kedify Agent can read Services and that each runner Service has the labels shown above and a port named metrics.

If the Prometheus check fails, query Prometheus directly with the configured prometheus.sourceQuery and confirm it returns the expected value during a check run.
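
One way to run that query is through the Prometheus HTTP API. The sketch below uses the standard /api/v1/query endpoint. `prom_query` is a hypothetical helper, and the base URL depends on how you reach Prometheus (for example a local port-forward).

```shell
# Sketch: run an instant query against the Prometheus HTTP API.

prom_query() {
  # $1 = Prometheus base URL, $2 = PromQL expression
  curl -sS -G "$1/api/v1/query" --data-urlencode "query=$2"
}
```

For example, `prom_query http://localhost:9090 'sum(work_simulator_inprogress_tasks)'` returns a JSON document. An empty result list while a check is running suggests that Prometheus is not scraping the target.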

If the CPU or memory check fails before scaling up, confirm that metrics.k8s.io returns Pod resource metrics in the autoscaling-checks namespace.
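
The resource metrics API can be queried directly to confirm this. The sketch assumes the v1beta1 version of metrics.k8s.io, which is what metrics-server commonly serves. `pod_metrics` is a hypothetical helper.

```shell
# Sketch: fetch raw Pod metrics for a namespace from metrics.k8s.io.
# Assumes the v1beta1 API version.

pod_metrics() {
  # $1 = namespace (defaults to autoscaling-checks)
  kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/${1:-autoscaling-checks}/pods"
}
```

A JSON list with usage entries for the target Pods indicates that the metrics path works. An error or an empty items list points at the metrics provider rather than at KEDA or the HPA.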

If scale-down takes too long, check the ScaledObject cooldown values and the chart's common.scaleDownTimeout setting.
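
The timing fields can be read from the ScaledObjects in one table. This sketch assumes the upstream KEDA field names (`pollingInterval`, `cooldownPeriod`, and the HPA scale-down stabilization window under `advanced`). Which of them the chart actually sets is not confirmed here, and unset fields show as `<none>`.

```shell
# Sketch: show scale-down related timings for each ScaledObject.
# Field paths are upstream KEDA names; the chart may leave some unset.

scaledobject_timings() {
  # $1 = namespace (defaults to autoscaling-checks)
  kubectl -n "${1:-autoscaling-checks}" get scaledobject \
    -o 'custom-columns=NAME:.metadata.name,POLLING:.spec.pollingInterval,COOLDOWN:.spec.cooldownPeriod,STABILIZATION:.spec.advanced.horizontalPodAutoscalerConfig.behavior.scaleDown.stabilizationWindowSeconds'
}
```

To compare these with the chart setting, `helm get values autoscaling-checks --namespace autoscaling-checks --all` prints the computed values, including common.scaleDownTimeout.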

Uninstall

helm uninstall autoscaling-checks --namespace autoscaling-checks