KEDA Best Practices
Specify ScaledObject Fallback
In KEDA, a ScaledObject can be configured to use a fallback mechanism to ensure reliability in scaling operations. The fallback configuration defines default scaling behavior in case the primary metrics are unavailable or misconfigured. This is crucial for maintaining a baseline level of service during failures.
Example Configuration:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-scaledobject
spec:
  scaleTargetRef:
    name: my-deployment
  fallback:
    failureThreshold: 3 # Number of consecutive failures to trigger fallback
    replicas: 6         # Number of replicas to scale to during fallback
```

In this example:
- `failureThreshold` is set to 3, meaning the fallback mechanism will activate after three consecutive failures to retrieve metrics.
- `replicas` is set to 6, ensuring that at least six replicas are maintained during fallback.
This configuration ensures that even if the primary scaling metrics fail, the application will maintain availability and performance by having a specified number of replicas running.
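For a fuller picture, the `fallback` section sits alongside the triggers it protects. Below is a minimal sketch, assuming a hypothetical RabbitMQ queue and host (the names, replica counts, and `queueLength` value are illustrative, not part of the example above):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-scaledobject-with-trigger # hypothetical name
spec:
  scaleTargetRef:
    name: my-deployment
  minReplicaCount: 1
  maxReplicaCount: 20
  fallback:
    failureThreshold: 3 # three consecutive failed metric reads activate the fallback
    replicas: 6         # hold six replicas until the scaler recovers
  triggers:
    - type: rabbitmq
      metadata:
        queueName: my-queue           # placeholder queue
        host: rabbitmq.my-cluster.com # placeholder host
        queueLength: '5'
```

Once KEDA can retrieve metrics from the trigger again, normal trigger-based scaling resumes and the fallback replica count no longer applies.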
Specify HPA Behavior
The Horizontal Pod Autoscaler (HPA) behavior can be finely tuned to better manage scaling operations, particularly for scaling between 1 and N replicas. For scaling between 0 and 1 replicas, configure the `cooldownPeriod` and `pollingInterval` together with the trigger's activation setting instead. For HPA behavior (and thus 1 <-> N scaling), the key configurations are stabilization windows, scale-up policies, and scale-down policies. These settings control how quickly the HPA responds to changes in load, preventing rapid fluctuations and ensuring smoother scaling transitions.
The HPA behavior should be configured directly in the ScaledObject; for more details, refer to the KEDA documentation.
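For the 0 <-> 1 case mentioned above, `pollingInterval` and `cooldownPeriod` are set on the ScaledObject itself, while the activation threshold is set on the trigger. Here is a minimal sketch, assuming a hypothetical Prometheus trigger (the server address, query, and threshold values are illustrative):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-zero-to-one-scaledobject # hypothetical name
spec:
  scaleTargetRef:
    name: my-deployment
  pollingInterval: 15 # how often (in seconds) KEDA queries the scaler
  cooldownPeriod: 300 # seconds to wait after the last active trigger before scaling back to 0
  minReplicaCount: 0  # allow scale to zero
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090 # placeholder address
        query: sum(rate(http_requests_total[2m]))        # placeholder query
        threshold: '100'          # target value for 1 <-> N scaling via the HPA
        activationThreshold: '10' # workload is activated (0 -> 1) only above this value
```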
Example Configuration:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-scaledobject
spec:
  scaleTargetRef:
    name: my-deployment
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 300 # Stabilization window to prevent flapping
          policies:
            - type: Percent
              value: 100
              periodSeconds: 60
        scaleDown:
          stabilizationWindowSeconds: 300 # Stabilization window to prevent flapping
          policies:
            - type: Percent
              value: 50
              periodSeconds: 60
  triggers:
    - type: rabbitmq
      metadata:
        queueName: my-queue
        host: rabbitmq.my-cluster.com
        queueLength: '5'
```

In this example:
- `scaleUp` and `scaleDown` behaviors include a `stabilizationWindowSeconds` of 300 seconds to prevent rapid fluctuations (flapping) in the number of replicas.
- Policies are set to scale by a percentage (`Percent`) of the current number of pods. For scaling up, it can increase by up to 100% of the current pods every 60 seconds. For scaling down, it can decrease by up to 50% every 60 seconds.
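HPA behavior policies are not limited to percentages. As a hedged variant using standard Kubernetes HPA v2 fields (the values are illustrative), the `scaleDown` block could also cap removal in absolute pod counts and pick the more conservative of the two policies via `selectPolicy`:

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    selectPolicy: Min # apply whichever policy removes the fewest pods
    policies:
      - type: Percent
        value: 50     # remove at most 50% of the current pods per period
        periodSeconds: 60
      - type: Pods
        value: 2      # and never more than 2 pods per period
        periodSeconds: 60
```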
Scaling Modifiers
Scaling modifiers in KEDA provide a sophisticated method to fine-tune autoscaling behaviors by creating a composite metric for the Horizontal Pod Autoscaler (HPA). This allows for more complex scaling logic beyond simple metric evaluation, enabling custom scaling decisions based on a variety of metrics and conditions.
Example Configuration with Scaling Modifiers:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-composite-scaledobject
spec:
  scaleTargetRef:
    name: my-rabbitmq-consumer
  advanced:
    scalingModifiers:
      formula: '(metric1 + metric2) / 2'
      target: '5'
      activationTarget: '3'
      metricType: AverageValue
  triggers:
    - type: rabbitmq
      name: metric1
      metadata:
        queueName: queue1
        host: rabbitmq-host1
        queueLength: '20'
    - type: rabbitmq
      name: metric2
      metadata:
        queueName: queue2
        host: rabbitmq-host2
        queueLength: '25'
```

In this example:
- Two RabbitMQ triggers (`metric1` and `metric2`) are defined, each monitoring different queues.
- The `formula` calculates the average queue length by combining `metric1` and `metric2` and dividing by 2.
- `target` is set to 5, meaning the desired average queue length is 5.
- `activationTarget` is set to 3, which is the threshold for activating the ScaledObject.
- `metricType` is `AverageValue`, indicating the use of the average value of the metrics.
By using scaling modifiers, you can create more nuanced and responsive scaling strategies, ensuring your applications scale intelligently based on a composite of multiple metrics.
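Because the formula is an ordinary expression over the named triggers, it can also weight them unequally. A small variant of the `scalingModifiers` block above (the weights are illustrative, not a recommendation; the triggers named `metric1` and `metric2` stay as in the example):

```yaml
advanced:
  scalingModifiers:
    formula: '0.7 * metric1 + 0.3 * metric2' # weight queue1 more heavily than queue2
    target: '5'
    activationTarget: '3'
    metricType: AverageValue
```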