Multi-cluster scaling with DistributedScaledJob

This guide shows how to use DistributedScaledJob (DSJ) to process queue-based workloads across multiple member clusters.

Prerequisites:

  • Kedify Agent is installed in your KEDA cluster.
  • Member clusters are already registered in kedify-agent-multicluster-kubeconfigs.
  • kubectl is installed and has access to:
    • KEDA cluster context
    • each member cluster context

DistributedScaledJob requires:

  • DSJ_ENABLED="true" in kedify-agent
  • KEDA raw metrics gRPC protocol enabled

Enable the DSJ controller:

kubectl -n keda set env deploy/kedify-agent DSJ_ENABLED="true"

Enable KEDA raw metrics (Helm values example):

keda:
  env:
    - name: RAW_METRICS_GRPC_PROTOCOL
      value: enabled

Create this ConfigMap in the KEDA cluster (default namespace in this example):

apiVersion: v1
kind: ConfigMap
metadata:
  name: dsj-mock-metric
  namespace: default
data:
  metric-value: "0"

kubectl --context keda-cluster -n default apply -f dsj-mock-metric.yaml

Apply this DistributedScaledJob in the KEDA cluster:

apiVersion: keda.kedify.io/v1alpha1
kind: DistributedScaledJob
metadata:
  name: dsj-processor
  namespace: default
spec:
  memberClusters:
    - name: member-1
      weight: 2
    - name: member-2
      weight: 3
  clusterScheduling:
    strategy: weightedRoundRobin
    failoverPolicy:
      gracePeriod: 1m
      hardTaintDuration: 5m
      softTaintDuration: 3m
  scaledJobSpec:
    pollingInterval: 30
    maxReplicaCount: 20
    successfulJobsHistoryLimit: 2
    failedJobsHistoryLimit: 2
    scalingStrategy:
      strategy: pendingAware
    jobTargetRef:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: processor
              image: busybox:1.36
              command:
                - /bin/sh
                - -c
                - |
                  echo "processing message";
                  sleep 10
    triggers:
      - type: kubernetes-resource
        name: cfg
        metadata:
          resourceKind: ConfigMap
          resourceName: dsj-mock-metric
          key: metric-value
          targetValue: "5"

kubectl --context keda-cluster -n default apply -f distributedscaledjob.yaml

Increase the ConfigMap value above target (5):

kubectl --context keda-cluster -n default patch configmap dsj-mock-metric \
--type merge \
-p '{"data":{"metric-value":"20"}}'

Set it back below target:

kubectl --context keda-cluster -n default patch configmap dsj-mock-metric \
--type merge \
-p '{"data":{"metric-value":"0"}}'

When the value is above target, DSJ creates Jobs and distributes them across member clusters according to weights.

Check DSJ status in the KEDA cluster:

kubectl --context keda-cluster -n default get distributedscaledjob dsj-processor -o yaml

Check created Jobs in each member cluster:

kubectl --context member-1 -n default get jobs
kubectl --context member-2 -n default get jobs
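
To compare the clusters at a glance, the two commands above can be wrapped in a small loop. This is a minimal sketch: count_lines and jobs_per_cluster are hypothetical helper names, and it assumes the context names and the default namespace used in this guide.

```shell
# Count non-empty lines on stdin. With `kubectl get jobs --no-headers`,
# each output line corresponds to one Job.
count_lines() {
  # grep -c exits non-zero when it counts 0 lines; ignore that status.
  grep -c . || true
}

# Hypothetical wrapper: prints "<context>: <n> jobs" for each context given.
jobs_per_cluster() {
  for ctx in "$@"; do
    n=$(kubectl --context "$ctx" -n default get jobs --no-headers 2>/dev/null | count_lines)
    echo "$ctx: $n jobs"
  done
}

# Example: jobs_per_cluster member-1 member-2
```

The count includes completed Jobs that are still retained, so it depends on successfulJobsHistoryLimit and failedJobsHistoryLimit as well as on currently running Jobs.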

Over time, Jobs should be split between member-1 and member-2 in roughly a 2:3 ratio, matching the configured weights.
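
As a rough guide to what the weights imply, the sketch below splits a Job count proportionally. expected_share is a hypothetical helper, and the purely proportional split is an assumption: the actual scheduler may round differently or shift Jobs during failover.

```shell
# Hypothetical helper: proportional share of a Job count for one cluster.
# usage: expected_share TOTAL_JOBS WEIGHT WEIGHT_SUM
expected_share() {
  # Integer division; any remainder is left unassigned in this sketch.
  echo $(( $1 * $2 / $3 ))
}

total=20   # equals maxReplicaCount in the example above
echo "member-1: $(expected_share "$total" 2 5)"   # prints "member-1: 8"
echo "member-2: $(expected_share "$total" 3 5)"   # prints "member-2: 12"
```

Small samples can deviate noticeably from these numbers; the ratio is only expected to converge as more Jobs are created.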

Troubleshooting:

  • No Jobs are created:
    • Verify DSJ_ENABLED="true" on kedify-agent.
    • Verify KEDA has RAW_METRICS_GRPC_PROTOCOL=enabled.
    • Verify memberClusters[].name matches registered member names exactly.
  • Trigger does not fire:
    • Verify the ConfigMap exists in the same namespace as the DSJ.
    • Verify metric-value is numeric and above targetValue.
  • Jobs stuck in Pending:
    • Check member cluster resources and scheduling constraints.
    • Review DSJ status and clusterScheduling.failoverPolicy behavior.
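
The "numeric and above targetValue" check can be scripted. The following is a minimal sketch: is_above_target is a hypothetical helper, and it assumes the metric is a non-negative integer, as in this guide's ConfigMap.

```shell
# Hypothetical helper: succeeds (exit 0) when VALUE is a non-negative
# integer strictly greater than TARGET.
# Returns 2 for empty or non-numeric input, 1 when VALUE <= TARGET.
is_above_target() {
  value=$1
  target=$2
  case $value in
    ''|*[!0-9]*) return 2 ;;   # empty, or contains a non-digit character
  esac
  [ "$value" -gt "$target" ]
}

# Example, using the ConfigMap and target from this guide:
#   v=$(kubectl --context keda-cluster -n default get configmap dsj-mock-metric \
#         -o jsonpath='{.data.metric-value}')
#   is_above_target "$v" 5 && echo "trigger should fire"
```

An exit status of 2 points to a malformed value in the ConfigMap, while 1 means the value is valid but not above the target.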