Pod Resource Profiles
Pod Resource Profiles describe a future update of a pod’s resources (CPU, memory), applied in place without requiring a pod restart. Because a profile doesn’t change the number of replicas of a workload but instead works at the pod level, adjusting the resources of a container running in a pod, it functions as a vertical scaler.
In-place Updates
It allows resource adjustments for a container without requiring a container or pod restart. This feature must be enabled on the Kubernetes cluster; otherwise, the patch operation will fail with an error. For more details, consult the InPlacePodVerticalScaling feature gate.
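On a local cluster, the feature gate can usually be passed straight to the bootstrap command. A minimal sketch, assuming minikube (the Quick Start below shows the k3d equivalent):

```shell
# Sketch: start a local cluster with in-place pod resizing enabled
minikube start --feature-gates=InPlacePodVerticalScaling=true
```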
Pod Resource Profile (PRP) CRD
The Kedify Agent contains a controller that reconciles PRP (PodResourceProfile) resources and also manages pods annotated with the following:
prp.kedify.io/reconcile: true
Based on the rules specified in the PRP custom resource, the controller either acts immediately or schedules an event for a later time.
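For a Deployment, the annotation goes on the pod template so that every replica opts in. A minimal sketch:

```yaml
# Sketch: opting a Deployment's pods in to PRP reconciliation
spec:
  template:
    metadata:
      annotations:
        prp.kedify.io/reconcile: "true"
```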
Example PRP:
```yaml
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: podresourceprofile-sample
spec:
  selector: # use either selector or spec.target
    matchLabels:
      app: nginx
  containerName: nginx # required - container name to update
  paused: false # optional, defaults to false
  priority: 0 # optional, defaults to 0
  trigger: # allowed values: (de)activated, container{Ready,Started}, pod{Ready,Scheduled,Running}
    after: containerReady # optional, defaults to containerReady
    delay: 30s # required, examples: 20s, 1m, 90s, 2m30s, 2h
  newResources: # required - new requests and/or limits
    requests:
      memory: 50M
      cpu: 200m
```
This PodResourceProfile ensures that the container named nginx in a pod matching the specified selector (app=nginx) is updated 30 seconds after it becomes ready. Once the readiness probe passes, the timer starts; the memory request will eventually be set to 50 MB and the CPU request to 200 millicores.
The controller can be enabled or disabled on the Kedify Agent using the PRP_ENABLED environment variable. By default, it is disabled. Additionally, the requirement for annotated pods can be turned off using the PRP_REQUIRES_ANNOTATED_PODS environment variable. However, this may have performance implications, since the controller filters out pod events that do not change container or pod readiness status or that are not referenced by a PRP resource.
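Assuming the agent runs as a regular Kubernetes Deployment, its container env could be set along these lines (a sketch; with Helm the same values would go through your values file):

```yaml
# Sketch: enabling the PRP controller on the Kedify Agent container
env:
  - name: PRP_ENABLED
    value: "true"
  - name: PRP_REQUIRES_ANNOTATED_PODS # optional: also watch pods without the annotation
    value: "false"
```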
Addressing Pods
In the example above, a common label selector was used. It has the same spec as a Deployment’s selector, so anything that can appear under deployment.spec.selector can be used here as well. Using the selector, we can make the PodResourceProfile feature available also for Jobs and CronJobs.
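For instance, Kubernetes labels the pods created by a Job, so a selector sketch like the following could address them (the job name is hypothetical, and the exact label key — batch.kubernetes.io/job-name vs. the older job-name — depends on your Kubernetes version):

```yaml
# Sketch: selecting the pods of a hypothetical Job named nightly-report
selector:
  matchExpressions:
    - key: batch.kubernetes.io/job-name
      operator: In
      values:
        - nightly-report
```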
Another way to target pods is by using the target field. For example:
```yaml
target:
  kind: deployment
  name: nginx
```
Using selector and target is mutually exclusive. The allowed kinds for the .spec.target field are:
- deployment
- statefulset
- daemonset
- scaledobject
It is assumed that the workload is present in the same namespace as the created PRP resource.
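Putting it together, the sample profile from above could equivalently address its workload via target (a sketch reusing the earlier values; the metadata name is illustrative):

```yaml
# Sketch: the earlier sample profile, addressed via target instead of selector
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: podresourceprofile-sample
spec:
  target:
    kind: deployment
    name: nginx
  containerName: nginx
  trigger:
    after: containerReady
    delay: 30s
  newResources:
    requests:
      memory: 50M
      cpu: 200m
```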
Triggers
Allowed values include:
- containerReady: (default value) specifies whether the container is currently passing its readiness check. The value will change as readiness probes continue executing. If no readiness probes are specified, this field defaults to true once the container is fully started.
  - field: pod.status.containerStatuses.ready
  - time: pod.status.containerStatuses.state.running.startedAt
- containerStarted: indicates whether the container has completed its postStart lifecycle hook and passed its startup probe. Initialized as false, it becomes true after the startupProbe is considered successful. It resets to false if the container is restarted or if the kubelet temporarily loses state; in both cases, startup probes will run again. It is always true if no startupProbe is defined and the container is running and has passed the postStart lifecycle hook. A null value must be treated the same as false.
  - field: pod.status.containerStatuses.started
  - time: pod.status.containerStatuses.state.running.startedAt
- podReady: indicates that the pod can service requests and should be added to the load-balancing pools of all matching services.
  - field: pod.status.conditions[?(.type=='Ready')].status
  - time: pod.status.conditions[?(.type=='Ready')].lastTransitionTime
- podScheduled: represents the status of the scheduling process for this pod.
  - field: pod.status.conditions[?(.type=='PodScheduled')].status
  - time: pod.status.conditions[?(.type=='PodScheduled')].lastTransitionTime
- podRunning: indicates that the pod has been bound to a node and all containers have started; at least one container is still running or is being restarted.
  - field: pod.status.phase
  - time: pod.status.startTime
- (de)activated: this value is only allowed when .spec.target.kind is set to scaledobject; the profile triggers based on the activation status of the associated ScaledObject.
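For example, to act earlier in the pod lifecycle than the default, a trigger sketch using one of the values above could look like this:

```yaml
# Sketch: resize one minute after the pod has been scheduled
trigger:
  after: podScheduled
  delay: 1m
```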
ScaledObject & PodResourceProfile
One of the allowed values for .spec.target.kind is also scaledobject. In this case, the trigger of such a PodResourceProfile must be set to either activated or deactivated.
This way, we can lower the requested resources when the workload is idle. It is similar to scaling to zero replicas; however, in this setup we shrink the pod size (in terms of resources) rather than the number of replicas.
Quick recap: When does a ScaledObject change state?
- deactivated – event rate < targetValue for the configured cooldown period.
- activated – event rate ≥ targetValue (10 req/s in the example below).
These state flips are now first‑class PRP triggers.
Example:
```yaml
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: nginx-active
spec:
  target:
    kind: scaledobject
    name: nginx
  containerName: nginx
  trigger:
    after: activated
    delay: 0s
  newResources:
    requests:
      memory: 250M
---
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: nginx-standby
spec:
  target:
    kind: scaledobject
    name: nginx
  containerName: nginx
  trigger:
    after: deactivated
    delay: 5s
  newResources:
    requests:
      memory: 30M
---
kind: ScaledObject
apiVersion: keda.sh/v1alpha1
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicaCount: 1
  maxReplicaCount: 8
  triggers:
    - type: kedify-http
      metadata:
        hosts: www.my-app.com
        service: http-demo-service
        port: '8080'
        scalingMetric: requestRate
        targetValue: '10'
```
Using the nginx-standby PodResourceProfile, we shrink the memory request of the last replica (because minReplicaCount is set to 1) to 30 megabytes. The new resources are applied only after 5 seconds. This can help with the cold-start delays associated with the scale-to-zero use case when the application is not optimized for fast starts.
To request more resources for the workload in case of increased traffic, we can use the nginx-active PRP. In that case, the ScaledObject nginx first needs to become active (at least 10 requests per second), and then its requested memory is set to 250 megabytes.
What we have effectively achieved with this setup is a more efficient allocation of resources in our Kubernetes cluster. When the nginx deployment is not being actively used, it can free some precious allocatable memory and let other workloads run. Combined with Karpenter, this can translate into savings on the cloud bill.
Ready to Try?
- Upgrade to the latest Kedify Agent (>= v0.2.16 for PRP‑ScaledObject support).
- Enable the InPlacePodVerticalScaling feature‑gate on your cluster if it’s not already on.
- Create your first “‑standby / ‑active” pair and watch your requested GBs plummet.
Have questions or want hands‑on help benchmarking savings? Ping us on Slack or book a 15‑minute chat. We love swapping scaling war stories.
Use-cases
Pod Resource Profiles are useful in scenarios where workloads exhibit predictable resource consumption behavior. Certain application frameworks require a significant amount of memory or CPU during startup for initialization but then need less during steady operation.
Another example could be a job that runs to completion but requires different computational resources at different stages. Instead of allocating the maximum resources for all phases, the PRP can match the workload’s actual utilization profile, allowing for more efficient bin packing by the Kubernetes scheduler.
The current design allows multiple PRP resources to target the same pods. In such cases, matching PRPs are sorted first by priority (.spec.priority), followed by the delay; the PRP with the smallest unapplied delay is selected over one with a higher delay. If multiple PRPs still match, they are sorted alphabetically, with the “smaller” one winning. This enables multiple PRPs to be set up for the same workload, changing resource allocations multiple times throughout the pod’s lifecycle.
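As an illustration, two profiles for the same workload could step the memory request down in stages (a sketch; the names, delays, and sizes are hypothetical, and ordering relies on the delay-based selection described above):

```yaml
# Sketch: two PRPs for one workload, applied at different points in the pod's life
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: nginx-after-boot
spec:
  selector:
    matchLabels:
      app: nginx
  containerName: nginx
  trigger:
    delay: 30s # shortly after readiness, drop the boot-time headroom
  newResources:
    requests:
      memory: 100M
---
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: nginx-steady-state
spec:
  selector:
    matchLabels:
      app: nginx
  containerName: nginx
  trigger:
    delay: 5m # much later, settle on the steady-state request
  newResources:
    requests:
      memory: 60M
```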
Quick Start
```shell
# Prepare a cluster with the feature enabled
k3d cluster create dyn-resources --no-lb --k3s-arg "--disable=traefik,servicelb@server:*" \
  --k3s-arg "--kube-apiserver-arg=feature-gates=InPlacePodVerticalScaling=true@server:*"

# Install Kedify agent
...
```
Create a sample nginx deployment with one pod, give it a memory request, annotate it for PRP reconciliation, and verify that an in-place patch of the request to 45Mi works.
```shell
# Create a deployment
kubectl create deployment nginx --image=nginx

# Wait for it to become ready
kubectl rollout status deploy/nginx

# Set resource requests to observe changes
kubectl set resources deployment nginx --requests=memory=40Mi

# Add the required annotation to the pods
kubectl patch deployments.apps nginx --type=merge -p \
  '{"spec":{"template": {"metadata":{"annotations": {"prp.kedify.io/reconcile": "enabled"}}}}}'

# Verify that in-place patches on resources work
kubectl patch po $(kubectl get po -lapp=nginx -ojsonpath="{.items[0].metadata.name}") --type=json \
  -p '[{"op":"replace","path":"/spec/containers/0/resources/requests/memory","value":"45Mi"}]'
# If the request fails, ensure the Kubernetes cluster has the InPlacePodVerticalScaling feature on
```
Now, let’s create the PodResourceProfile resource; 20 seconds after the container becomes ready, the memory request will be changed to 30MB.
```shell
# Create a PRP resource for the controller
cat <<PRP | kubectl apply -f -
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  containerName: nginx
  trigger:
    delay: 20s
  newResources:
    requests:
      memory: 30M
PRP
```
Finally, check that everything works as expected.
```shell
# Check the PRP resource
kubectl get prp -owide

# After some time
kubectl get po -lapp=nginx -ojsonpath="{.items[*].spec.containers[?(.name=='nginx')].resources}" | jq
```