In this blog, we take the next step toward a complete billing workflow—automatically transforming usage into billable cost using SKU-specific pricing.
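At its core, the transformation is a metering join: multiply each SKU's measured usage by its unit rate and sum. A minimal Python sketch of the idea, with hypothetical SKU names and rates (not any provider's real price list):

```python
# Minimal sketch: turn metered usage into billable cost with per-SKU rates.
# SKU names and prices below are hypothetical examples, not real price lists.
SKU_RATES_USD_PER_HOUR = {
    "gpu.a100.80gb": 3.20,
    "gpu.t4.16gb": 0.45,
}

def bill(usage_hours_by_sku: dict[str, float]) -> float:
    """Sum usage_hours * unit_rate across SKUs; unknown SKUs raise KeyError."""
    return sum(
        hours * SKU_RATES_USD_PER_HOUR[sku]
        for sku, hours in usage_hours_by_sku.items()
    )

print(bill({"gpu.a100.80gb": 10.0, "gpu.t4.16gb": 100.0}))  # 77.0
```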
The community Ingress NGINX project is entering end-of-life in March 2026. Discover what this means for Kubernetes users, why you’ll need to migrate, what alternatives exist (Gateway API, Traefik, etc.), and how to plan your transition smoothly with minimal disruption.
In the second blog of this series, we installed a Kubernetes v1.34 cluster and deployed an example DRA driver with "simulated GPUs" on it. In this blog, we’ll deploy a few workloads on the DRA-enabled cluster to understand how "ResourceClaims" and "ResourceClaimTemplates" work.
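For readers who haven't seen one: a ResourceClaimTemplate stamps out a fresh ResourceClaim for each pod that references it. Below is a rough sketch of such a manifest, expressed as a Python dict so it can be printed and piped to kubectl; the device class name gpu.example.com is a placeholder, and the field layout follows the resource.k8s.io/v1 API that went GA in v1.34 (older API versions differ):

```python
import json

# Rough sketch of a DRA ResourceClaimTemplate manifest as a Python dict.
# "gpu.example.com" is a placeholder device class; field names follow the
# resource.k8s.io/v1 API as of Kubernetes v1.34 and may differ elsewhere.
claim_template = {
    "apiVersion": "resource.k8s.io/v1",
    "kind": "ResourceClaimTemplate",
    "metadata": {"name": "single-gpu"},
    "spec": {
        "spec": {  # template for the ResourceClaim created per pod
            "devices": {
                "requests": [
                    {
                        "name": "gpu",
                        "exactly": {"deviceClassName": "gpu.example.com"},
                    }
                ]
            }
        }
    },
}

print(json.dumps(claim_template, indent=2))  # e.g. pipe to: kubectl apply -f -
```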
Cloud providers offering GPU or Neo Cloud services need accurate and automated mechanisms to track resource consumption.
Whether you’re training deep learning models, running simulations, or just curious about your GPU’s performance, nvidia-smi is your go-to command-line tool.
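Beyond the familiar utilization table that plain `nvidia-smi` prints, its CSV query mode is handy for scripting. A small Python wrapper around the real `--query-gpu` flags (the field list is trimmed for brevity; requires an NVIDIA driver on the host):

```python
import subprocess

# Poll GPU stats via nvidia-smi's CSV query mode.
FIELDS = "name,utilization.gpu,memory.used,memory.total,temperature.gpu"

def gpu_stats() -> list[dict[str, str]]:
    """Return one dict of queried fields per visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    keys = FIELDS.split(",")
    return [dict(zip(keys, line.split(", "))) for line in out.strip().splitlines()]

if __name__ == "__main__":
    for gpu in gpu_stats():
        print(gpu)
```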
In this post, we’ll look at how a new GA feature in Kubernetes v1.34 — Dynamic Resource Allocation (DRA) — aims to solve these problems and transform GPU scheduling in Kubernetes.
Kubernetes has cemented its position as the de-facto standard for orchestrating containerized workloads in the enterprise.
In this blog, we will describe how Rafay Zero Trust Kubectl Access Proxy gives Argo CD a secure path to every cluster in the fleet, even when those clusters sit deep behind corporate firewalls.
Learn about Rafay's approach to drift prevention and detection in this blog.
ArgoCD is a powerful GitOps controller for Kubernetes, enabling declarative configuration and automated synchronization of workloads.
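As a taste of what declarative means here: the desired state lives in an Application resource, and an automated sync policy keeps the cluster reconciled to Git. A hedged sketch of such a manifest as a Python dict (repo URL, path, and names are placeholders; field names follow the argoproj.io/v1alpha1 API):

```python
import json

# Sketch of an Argo CD Application with automated sync enabled.
application = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "metadata": {"name": "guestbook", "namespace": "argocd"},
    "spec": {
        "project": "default",
        "source": {
            "repoURL": "https://github.com/example/gitops-repo.git",  # placeholder
            "targetRevision": "HEAD",
            "path": "apps/guestbook",
        },
        "destination": {
            "server": "https://kubernetes.default.svc",
            "namespace": "guestbook",
        },
        # prune deletes resources removed from Git; selfHeal reverts live drift
        "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
    },
}

print(json.dumps(application, indent=2))
```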
As demand for GPU-accelerated workloads soars across industries, cloud providers are under increasing pressure to offer flexible, cost-efficient, and isolated access to GPUs.
As GPU acceleration becomes central to modern AI/ML workloads, Kubernetes has emerged as the orchestration platform of choice.
In the modern era of containerized machine learning and AI infrastructure, GPUs are a critical and expensive asset.
Artificial Intelligence (AI) has moved far beyond simple chatbots and rigid automation. At the frontier of this evolution lies a powerful new paradigm: AI Agents.
In multi-tenant GPU cloud environments, effective resource management is critical to ensure fair usage and prevent contention.
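One common guardrail, among others the post explores, is a per-namespace ResourceQuota on the GPU extended resource. A minimal sketch (assuming the NVIDIA device plugin exposes nvidia.com/gpu; the tenant-a namespace is hypothetical):

```python
import json

# Cap a tenant namespace at 4 GPUs via a ResourceQuota on the extended
# resource exposed by the NVIDIA device plugin (requests.nvidia.com/gpu).
quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "tenant-a-gpu-quota", "namespace": "tenant-a"},
    "spec": {"hard": {"requests.nvidia.com/gpu": "4"}},
}

print(json.dumps(quota, indent=2))
```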
If you’re running Kubernetes workloads on Amazon EKS backed by Intel-based instances, you’re leaving significant savings on the table.
Together, Project Slinky and Rafay’s GPU Platform-as-a-Service (PaaS) give enterprises and cloud providers a transformative combination: secure, multi-tenant, self-service access to Slurm-based HPC environments on shared Kubernetes clusters.
As high-performance computing (HPC) environments evolve, there’s an increasing demand to bridge the gap between traditional HPC job schedulers and modern cloud-native infrastructure.
In Kubernetes, exposing services of type LoadBalancer in on-prem or bare-metal environments typically requires a dedicated “Layer 2” or “BGP-based” software load balancer—such as MetalLB.
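The Service itself looks the same everywhere; what bare metal lacks is a controller to answer for the external IP. A minimal sketch of a LoadBalancer Service as a Python dict (app name and ports are placeholders; an implementation such as MetalLB must be installed for the address to actually be assigned):

```python
import json

# A plain Service of type LoadBalancer; on bare metal, something like
# MetalLB must be present to allocate and announce the external IP.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web", "namespace": "default"},
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "web"},
        "ports": [{"port": 80, "targetPort": 8080, "protocol": "TCP"}],
    },
}

print(json.dumps(service, indent=2))
```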
In the fast-evolving world of GPU cloud services and AI infrastructure, accurate, flexible, and real-time billing is no longer optional — it’s mission critical.
Enterprises are increasingly leveraging Amazon SageMaker AI to empower their data science teams with scalable, managed machine learning (ML) infrastructure.
In this step-by-step guide, a bioinformatics data scientist will use Rafay’s end user portal to launch a well-resourced remote VM and run a series of BioContainers with Docker.
In today’s fast-paced world of bioinformatics, the constant evolution of tools, dependencies, and operating system environments presents a significant challenge.
Enterprises are turning to AI/ML to solve new problems and simplify their operations, but running AI in the datacenter often compromises performance. Edge inference moves workloads closer to users, enabling low-latency experiences with less overhead, but managing GPUs (Graphics Processing Units) across distributed infrastructure has traditionally been cumbersome.