
Multi-Cluster Scaling

Kedify supports scaling workloads across a fleet of Kubernetes clusters. This is achieved through two new custom resources, DistributedScaledObject and DistributedScaledJob, which extend the standard KEDA ScaledObject and ScaledJob with multi-cluster capabilities.

There are two main types of clusters involved in multi-cluster scaling:

  1. KEDA Cluster: This cluster runs the Kedify stack and manages the scaling logic. It monitors the metrics and decides when to scale workloads up or down.
  2. Member Clusters: These clusters host the actual workloads that need to be scaled. They expose their kube-apiserver to the KEDA cluster for management.

The member clusters don’t need to run KEDA themselves, as scaling decisions for DistributedScaledObject or DistributedScaledJob are made by the KEDA cluster. This allows for a smaller footprint on member clusters and enables edge scenarios where resources are limited.

In order to connect a member cluster to the KEDA cluster, you need to make the kube-apiserver of the member cluster accessible from the KEDA cluster. This can be done using various methods such as VPN, VPC peering, or exposing the API server via a load balancer with proper security measures.

With the connectivity established, you can use Kedify’s kubectl plugin to register member clusters to the KEDA cluster:

kubectl kedify mc setup-member <name> --keda-kubeconfig <path> --member-kubeconfig <path>

This command uses the provided kubeconfig files to set up the access and permissions the KEDA cluster needs to manage the member cluster. The member-kubeconfig must have sufficient permissions to create RBAC resources, a ServiceAccount, and the keda namespace in the member cluster; these resources are created with the minimal privileges required for Kedify multi-cluster to operate. The keda-kubeconfig must have permission to patch the Secret named kedify-agent-multicluster-kubeconfigs in the keda namespace of the KEDA cluster. To connect multiple member clusters, repeat the command with a different name and kubeconfig files for each member cluster.

If the KEDA cluster should connect to the member cluster using a different address than the one specified in the member-kubeconfig, you can pass the --member-api-url <url> flag to override the API server URL.
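
For example, to register a member cluster that the KEDA cluster should reach over a different address (the cluster name, kubeconfig paths, and URL below are only placeholders):

kubectl kedify mc setup-member member-cluster-1 --keda-kubeconfig ./keda.kubeconfig --member-kubeconfig ./member.kubeconfig --member-api-url https://10.0.0.10:6443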

You can also list and remove registered member clusters using the following commands:

kubectl kedify mc list-members
kubectl kedify mc delete-member <name>

A ScaledObject is a KEDA resource that defines how to scale a specific workload based on certain metrics. The DistributedScaledObject extends this concept to support scaling across multiple clusters. It includes all the fields of a standard ScaledObject, along with additional fields to specify the member clusters and their configurations.

Distributed ScaledObject Architecture

Here is an example of a DistributedScaledObject:

apiVersion: keda.kedify.io/v1alpha1
kind: DistributedScaledObject
metadata:
  name: nginx
spec:
  memberClusters: # optional list of member clusters to use, if omitted all registered member clusters will be used
    - name: member-cluster-1
      weight: 4 # weight determines the proportion of replicas to be allocated to this cluster
    - name: member-cluster-2
      weight: 6
  rebalancingPolicy: # optional parameters for rebalancing replicas across member clusters in case of outage or issues
    gracePeriod: 1m # when a member cluster becomes unreachable, wait for this duration before rebalancing replicas to other clusters
  scaledObjectSpec: # standard ScaledObject spec
    scaleTargetRef:
      kind: Deployment
      name: nginx
    minReplicaCount: 1
    maxReplicaCount: 10
    triggers:
      - type: kubernetes-resource
        metadata:
          resourceKind: ConfigMap
          resourceName: mock-metric
          key: metric-value
          targetValue: "5"

In this example, the DistributedScaledObject named nginx is configured to scale a Deployment named nginx across two member clusters. The memberClusters field selects the member clusters to be used along with their respective weights, which determine how many replicas are allocated to each cluster; with weights 4 and 6, a total of 5 replicas is split into 2 for member-cluster-1 and 3 for member-cluster-2, as shown in the status below. This section is optional; if omitted, all registered member clusters are used with equal weights.

The target workloads (here, a Deployment) are expected to already exist in the relevant member clusters, in the same namespace as the DistributedScaledObject.
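
For reference, a minimal sketch of such a Deployment that would need to exist in each selected member cluster (the image and labels are illustrative, not taken from the example above):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable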

The rebalancingPolicy field allows you to specify how to handle situations where a member cluster becomes unreachable. In this case, after the specified gracePeriod, the replicas that were allocated to the unreachable cluster will be redistributed among the remaining healthy clusters. Once the unreachable cluster becomes healthy again, the replicas will be rebalanced back according to the defined weights.

Status of the DistributedScaledObject provides insights into the scaling state across member clusters:

status:
  memberClusterStatuses:
    member-cluster-1:
      currentReplicas: 2
      description: Cluster is healthy
      desiredReplicas: 2
      id: /etc/mc/kubeconfigs/member-cluster-1.kubeconfig+kedify-agent@member-cluster-1
      lastStatusChangeTime: "2025-11-05T16:46:39Z"
      state: Ready
    member-cluster-2:
      currentReplicas: 3
      description: Cluster is healthy
      desiredReplicas: 3
      id: /etc/mc/kubeconfigs/member-cluster-2.kubeconfig+kedify-agent@member-cluster-2
      lastStatusChangeTime: "2025-11-05T15:45:44Z"
      state: Ready
  membersHealthyCount: 2
  membersTotalCount: 2
  selector: kedify-agent-distributedscaledobject=nginx
  totalCurrentReplicas: 5
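
Assuming the CRD exposes the conventional lowercase plural resource name (an assumption, not stated above), this status can be inspected with a command along these lines:

kubectl get distributedscaledobjects.keda.kedify.io nginx -o yaml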

Similar to DistributedScaledObject, the DistributedScaledJob extends KEDA’s ScaledJob concept to support job-based workloads across multiple clusters. It includes all the fields of a standard ScaledJob, along with additional fields to specify the member clusters and their configurations.

Distributed ScaledJob Architecture

Before using DistributedScaledJobs, make sure that the raw metrics endpoint in KEDA is enabled by setting the environment variable RAW_METRICS_GRPC_PROTOCOL to enabled. In the values.yaml file:

keda:
  env:
    - name: RAW_METRICS_GRPC_PROTOCOL
      value: enabled

Alternatively, from the command line, add this argument to your Helm installation command:

helm install ... \
  --set-json 'keda.env=[{"name":"RAW_METRICS_GRPC_PROTOCOL","value":"enabled"}]'

The following example of a DistributedScaledJob splits the execution of job processing from the RabbitMQ task queue between two member clusters in a 2:3 ratio.

apiVersion: keda.kedify.io/v1alpha1
kind: DistributedScaledJob
metadata:
  name: processor-job
spec:
  failedJobsHistoryLimit: 2 # keep up to 2 failed jobs, delete all older
  successfulJobsHistoryLimit: 2 # keep up to 2 jobs that completed successfully, delete all older
  memberClusters: # optional list of member clusters to use, if omitted all registered member clusters will be used
    - name: member-cluster-1
      weight: 2 # weight determines the proportion of jobs to be allocated to this cluster
    - name: member-cluster-2
      weight: 3
  rebalancingPolicy:
    gracePeriod: 3m # duration after which a pending pod is marked as stuck and moved to a different cluster
    failingClusterQuarantineDuration: 1m # duration to quarantine a failing cluster before attempting to scale jobs on it again
  scaledJobSpec: # standard ScaledJob spec
    jobTargetRef:
      template:
        spec:
          containers:
            - name: processor
              image: myapp:latest
              command: ["process"]
          restartPolicy: Never
    pollingInterval: 30
    maxReplicaCount: 20
    scalingStrategy:
      strategy: pendingAware # pending/stuck jobs should be duplicated on a different cluster
    triggers:
      - type: rabbitmq
        metadata:
          queueName: tasks
          host: http://guest:password@localhost:15672/path/vhost
          value: "5"

In this example, the DistributedScaledJob named processor-job is configured to scale Jobs across two member clusters. The key difference from DistributedScaledObject:

  • Job-based workload: Instead of scaling Deployments, it creates and manages Jobs based on metrics

The jobTargetRef field contains the standard Kubernetes Job template specification. Jobs are created in the member clusters based on the scaling metrics and cluster weights; with weights 2 and 3, 10 desired Jobs are split into 4 for member-cluster-1 and 6 for member-cluster-2, matching the status shown below.

Status of the DistributedScaledJob provides insights into the job state across member clusters:

status:
  desiredJobs: 10
  runningJobs: 8
  pendingJobs: 2
  memberClusterStatuses:
    member-cluster-1:
      currentReplicas: 3
      description: Cluster is healthy
      desiredReplicas: 4
      id: /etc/mc/kubeconfigs/member-cluster-1.kubeconfig+kedify-agent@member-cluster-1
      lastStatusChangeTime: "2025-11-05T16:46:39Z"
      state: Ready
      excluded: false
    member-cluster-2:
      currentReplicas: 5
      description: Cluster is healthy
      desiredReplicas: 6
      id: /etc/mc/kubeconfigs/member-cluster-2.kubeconfig+kedify-agent@member-cluster-2
      lastStatusChangeTime: "2025-11-05T15:45:44Z"
      state: Ready
      excluded: false
  selector: kedify-agent-distributedscaledjob=processor-job

For a walkthrough of how to set up and use multi-cluster scaling with Kedify, refer to the examples repository.

Scaling strategies are used to compute how many new Jobs to create across clusters. Choose one of: basic, pendingAware, custom, accurate, eager.

Inputs:

  • desiredJobsCount: target number derived from metrics and DSJ min/max bounds
  • runningJobsCount: number of non-terminal Jobs currently present (includes “pending”)
  • pendingJobsCount: subset of running Jobs considered “pending” (not yet progressed)
  • maxReplicaCount: DSJ upper bound on total concurrent non-terminal Jobs

basic: scale to the gap between desired and running.

  • Formula: desired - running
  • Behavior: simple “catch up” to desired.

pendingAware: immediately re-create pending Jobs on other clusters while honoring capacity.

  • Idea: replace stuck Jobs.
  • Formula:
    • needed = max(0, desired - running + pending)
    • capacity = max(0, maxReplica - running)
    • scaleTo = min(needed, capacity)
  • Use when pending/stuck Jobs should be duplicated elsewhere quickly.

custom: user-defined scaling using a percentage of running Jobs and an optional queue deduction.

  • Inputs: runningJobPercentage (float), queueLengthDeduction (int)
  • Formula: scaleTo = min(desired - deduction - running * percentage, maxReplica)
  • Notes:
    • If percentage parse fails, falls back to Basic.

accurate: balance towards desired while staying within capacity; subtract pending from desired unless over max.

  • Formula:
    • If desired + running > maxReplica: scaleTo = maxReplica - running
    • Else: scaleTo = desired - pending
  • Use when pending work should defer new creations and capacity must be respected.

eager: fill available capacity (excluding pending) up to desired.

  • Formula: scaleTo = min(maxReplica - running - pending, desired)
  • Use when it’s safe to aggressively utilize capacity.

In summary:

  • pendingAware (default): prioritize re-creating stuck Jobs elsewhere.
  • basic: simplest gap-based scaling.
  • accurate: conservative, subtracts pending.
  • eager: aggressive, fills capacity quickly.
  • custom: tailor behavior with percentage and deductions.
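
To make the formulas above concrete, here is a minimal Go sketch of the per-strategy computation. It only illustrates the documented formulas and is not the actual Kedify implementation; the function name and signature are made up for this example.

package main

import "fmt"

// newJobsToCreate applies the documented formulas for each scaling strategy.
// desired, running, pending and maxReplica correspond to desiredJobsCount,
// runningJobsCount, pendingJobsCount and maxReplicaCount described above.
// percentage and deduction are only used by the "custom" strategy.
// Uses the Go 1.21 min/max builtins.
func newJobsToCreate(strategy string, desired, running, pending, maxReplica int64, percentage float64, deduction int64) int64 {
	switch strategy {
	case "basic":
		// simple "catch up" to desired
		return desired - running
	case "pendingAware":
		// re-create pending/stuck Jobs elsewhere while honoring capacity
		needed := max(0, desired-running+pending)
		capacity := max(0, maxReplica-running)
		return min(needed, capacity)
	case "custom":
		// percentage of running Jobs plus an optional queue-length deduction
		return min(desired-deduction-int64(float64(running)*percentage), maxReplica)
	case "accurate":
		// respect capacity; otherwise subtract pending from desired
		if desired+running > maxReplica {
			return maxReplica - running
		}
		return desired - pending
	case "eager":
		// fill available capacity (excluding pending) up to desired
		return min(maxReplica-running-pending, desired)
	}
	return 0
}

func main() {
	// With desired=10, running=8, pending=2, maxReplica=20 (as in the status above):
	fmt.Println(newJobsToCreate("basic", 10, 8, 2, 20, 0, 0))        // 2
	fmt.Println(newJobsToCreate("pendingAware", 10, 8, 2, 20, 0, 0)) // 4
	fmt.Println(newJobsToCreate("accurate", 10, 8, 2, 20, 0, 0))     // 8
	fmt.Println(newJobsToCreate("eager", 10, 8, 2, 20, 0, 0))        // 10
}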