
Choosing the Right Fractional GPU Strategy for Cloud Providers
As demand for GPU-accelerated workloads soars across industries, cloud providers are under increasing pressure to offer flexible, cost-efficient, and isolated access...

Demystifying Fractional GPUs in Kubernetes: MIG, Time Slicing, and Custom Schedulers
As GPU acceleration becomes central to modern AI/ML workloads, Kubernetes has emerged as the orchestration platform of choice. However, allocating full GPUs for...

Custom GPU Resource Classes in Kubernetes
In the modern era of containerized machine learning and AI infrastructure, GPUs are a critical and expensive asset. Kubernetes makes scheduling and isolation easier, but...

The Rise of AI Agents: From Zero to Production
Artificial Intelligence (AI) has moved far beyond simple chatbots and rigid automation. At the frontier of this evolution lies a powerful new paradigm: AI Agents...

Configure and Manage GPU Resource Quotas in Multi-Tenant Clouds
In multi-tenant GPU cloud environments, effective resource management is critical to ensure fair usage and prevent contention. GPU resource quotas allow organizations...

Slash EKS Cluster Costs by 20-30% Instantly with AWS Graviton
If you’re running Kubernetes workloads on Amazon EKS backed by Intel-based instances, you’re leaving significant savings on the table. In this blog, we will...

Introduction to Slurm: The Backbone of HPC
This is part 1 of a blog series on Slurm. In this first part, we introduce some basic Slurm concepts. We are not talking about the fictional soft...

Self-Service Slurm Clusters on Kubernetes with Rafay GPU PaaS
In the previous blog, we discussed how Project Slinky bridges the gap between Slurm, the de facto job scheduler in HPC, and Kubernetes, the standard for modern...

Project Slinky: Bringing Slurm Scheduling to Kubernetes
As high-performance computing (HPC) environments evolve, there’s an increasing demand to bridge the gap between traditional HPC job schedulers and modern cloud-native...

Using Cilium as a Kubernetes Load Balancer: A Powerful Alternative to MetalLB
In Kubernetes, exposing services of type LoadBalancer in on-prem or bare-metal environments typically requires a dedicated “Layer 2” or “BGP-based”...