In the previous blog, we reviewed the limitations of Kubernetes GPU scheduling and the scheduling failures and operational friction they often cause.
In this post, we’ll look at how a new GA feature in Kubernetes v1.34 — Dynamic Resource Allocation (DRA) — aims to solve these problems and transform GPU scheduling in Kubernetes.
DRA introduces a Kubernetes-native way to request, allocate, and share hardware resources across Pods.
For accelerators like GPUs, DRA allows device vendors and cluster administrators to define device classes (e.g., types of GPUs). Workload owners can then request devices with specific configurations from those classes.
Once requested, Kubernetes handles Pod scheduling, node placement, and device assignment automatically. This eliminates the manual coordination between admins and app operators that exists today.
If you’ve used StorageClass, PersistentVolumeClaim, and PersistentVolume for dynamic storage provisioning, DRA will feel familiar. Here are the core concepts:
- DeviceClass: Defines a category of devices (e.g., GPUs). When the NVIDIA DRA driver is installed, a default DeviceClass (gpu.nvidia.com) is provided, and cluster admins can create additional DeviceClasses for specific configurations (see the sketch after this list).
- ResourceSlice: Represents available devices on a node. DRA drivers publish device inventory as ResourceSlices.
- ResourceClaim: A request for devices from a DeviceClass. Think of this as a ticket to specific hardware. Workload owners create a ResourceClaim to request devices from a DeviceClass.
- ResourceClaimTemplate: Think of this as a blueprint for generating new resource claims. Kubernetes creates a ResourceClaim automatically when using a template.
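To make DeviceClass concrete, here is a hedged sketch of a custom DeviceClass a cluster admin might define to target a specific GPU model. It assumes the resource.k8s.io/v1 API (GA in v1.34); the class name and the attribute name/value are illustrative, since attributes are published by the installed DRA driver:

```yaml
# Illustrative admin-defined DeviceClass that narrows device selection with a
# CEL expression. Attribute names and values depend on the installed DRA driver.
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: gpu-a100.example.com       # hypothetical class name
spec:
  selectors:
  - cel:
      expression: |-
        device.driver == "gpu.nvidia.com" &&
        device.attributes["gpu.nvidia.com"].productName == "NVIDIA A100-SXM4-40GB"
```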
DeviceClasses are typically defined by cluster admins. Note: Some device vendors provide default DeviceClasses out-of-the-box.
When workloads are deployed, Kubernetes performs these steps:
1. If a workload references a ResourceClaimTemplate, Kubernetes generates a fresh ResourceClaim for each Pod (e.g., every replica).
2. The scheduler matches ResourceClaims to available devices in ResourceSlices, then places Pods on nodes that can satisfy the claims.
3. On the selected node, kubelet invokes the DRA driver to attach the allocated devices to the Pod. The resulting allocation is recorded in the ResourceClaim's status, as sketched below.
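For illustration, here is a hedged sketch of what the status of an allocated ResourceClaim looks like under the resource.k8s.io/v1 API; the driver, pool, device, and consumer values are hypothetical placeholders:

```yaml
# Hypothetical status of a ResourceClaim after allocation (resource.k8s.io/v1).
# The driver, pool, device, and consumer values below are illustrative only.
status:
  allocation:
    devices:
      results:
      - request: gpu               # which request in the claim this satisfies
        driver: gpu.nvidia.com     # DRA driver that manages the device
        pool: worker-node-1        # device pool (often the node) it came from
        device: gpu-0              # the specific device assigned
  reservedFor:                     # Pods currently allowed to use this claim
  - resource: pods
    name: inference-pod
    uid: "d9607e19-f88f-11e6-a518-42010a800195"
```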
Both approaches let Pods request devices, but the behavior differs: with a ResourceClaim, multiple Pods can share one allocated device, while with a ResourceClaimTemplate, each Pod gets its own ResourceClaim and therefore its own device.
Now, let's review what the declarative YAML specs for ResourceClaim and ResourceClaimTemplate look like with some examples.
In the first example, multiple Pods can reference shared-gpu-claim and share the allocated GPU.
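Below is a minimal sketch of such a ResourceClaim, together with a Pod that consumes it, using the resource.k8s.io/v1 API (GA in v1.34). The request name, Pod name, and container image are placeholders; gpu.nvidia.com is the DeviceClass provided by the NVIDIA DRA driver:

```yaml
# ResourceClaim: a shareable "ticket" for one GPU from the gpu.nvidia.com class.
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: shared-gpu-claim
spec:
  devices:
    requests:
    - name: gpu                          # request name, referenced by Pods below
      exactly:
        deviceClassName: gpu.nvidia.com  # DeviceClass from the NVIDIA DRA driver
---
# Pod that references the shared claim. Additional Pods can reference the same
# claim name to share the allocated GPU.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-consumer-1
spec:
  resourceClaims:
  - name: gpu
    resourceClaimName: shared-gpu-claim  # bind to the pre-created claim
  containers:
  - name: app
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder image
    command: ["nvidia-smi"]
    resources:
      claims:
      - name: gpu                        # consume the claim in this container
```

Any other Pod that lists resourceClaimName: shared-gpu-claim in its spec.resourceClaims will be scheduled against, and share, the same allocated GPU.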

In the example below, Kubernetes automatically creates a new ResourceClaim for each replica in the Deployment. Each Pod gets a dedicated GPU.
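Here is a minimal sketch under the same assumptions (resource.k8s.io/v1; placeholder names and image), pairing a ResourceClaimTemplate with a Deployment whose Pod template references it:

```yaml
# ResourceClaimTemplate: a blueprint from which Kubernetes stamps out one
# ResourceClaim per Pod.
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: dedicated-gpu-template
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: gpu.nvidia.com
---
# Deployment: each of the 3 replicas gets its own generated ResourceClaim,
# and therefore its own dedicated GPU.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-workers
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gpu-worker
  template:
    metadata:
      labels:
        app: gpu-worker
    spec:
      resourceClaims:
      - name: gpu
        resourceClaimTemplateName: dedicated-gpu-template
      containers:
      - name: worker
        image: nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder image
        command: ["sleep", "infinity"]
        resources:
          claims:
          - name: gpu
```

Because each generated ResourceClaim is owned by its Pod, deleting the Pod releases the GPU and cleans up the claim automatically.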

Today, GPU allocation in Kubernetes often requires manual coordination between cluster admins and workload admins. Workloads fail unless admins carefully match requests with available devices using node selectors — essentially, an anti-pattern that breaks Kubernetes’ declarative scheduling model.
Users should not need to know about node labels, GPU models, or device topology.
With DRA, they simply request devices from DeviceClasses. In the next blog, we'll walk through how to configure, deploy, and use DRA with NVIDIA GPUs step by step.