In this blog, we take the next step toward a complete billing workflow—automatically transforming usage into billable cost using SKU-specific pricing.
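At its core, the transformation is a metering join: multiply each SKU's measured usage by its unit rate and sum. A minimal Python sketch of the idea, with hypothetical SKU names and rates (not any provider's real price list):

```python
# Minimal sketch: turn metered usage into billable cost with per-SKU rates.
# SKU names and prices below are hypothetical examples, not real price lists.
SKU_RATES_USD_PER_HOUR = {
    "gpu.a100.80gb": 3.20,
    "gpu.t4.16gb": 0.45,
}

def bill(usage_hours_by_sku: dict[str, float]) -> float:
    """Sum usage_hours * unit_rate across SKUs; unknown SKUs raise KeyError."""
    return sum(
        hours * SKU_RATES_USD_PER_HOUR[sku]
        for sku, hours in usage_hours_by_sku.items()
    )

print(bill({"gpu.a100.80gb": 10.0, "gpu.t4.16gb": 100.0}))  # 77.0
```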
The community Ingress NGINX project is entering end-of-life in March 2026. Discover what this means for Kubernetes users, why you’ll need to migrate, what alternatives exist (Gateway API, Traefik, etc.), and how to plan your transition smoothly with minimal disruption.
In the second blog of this series, we installed a Kubernetes v1.34 cluster and deployed an example DRA driver with "simulated GPUs" on it. In this blog, we’ll deploy a few workloads on the DRA-enabled cluster to understand how "ResourceClaims" and "ResourceClaimTemplates" work.
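For readers who haven't seen one: a ResourceClaimTemplate stamps out a fresh ResourceClaim for each pod that references it. Below is a rough sketch of such a manifest, expressed as a Python dict so it can be printed and piped to kubectl; the device class name gpu.example.com is a placeholder, and the field layout follows the resource.k8s.io/v1 API that went GA in v1.34 (older API versions differ):

```python
import json

# Rough sketch of a DRA ResourceClaimTemplate manifest as a Python dict.
# "gpu.example.com" is a placeholder device class; field names follow the
# resource.k8s.io/v1 API as of Kubernetes v1.34 and may differ elsewhere.
claim_template = {
    "apiVersion": "resource.k8s.io/v1",
    "kind": "ResourceClaimTemplate",
    "metadata": {"name": "single-gpu"},
    "spec": {
        "spec": {  # template for the ResourceClaim created per pod
            "devices": {
                "requests": [
                    {
                        "name": "gpu",
                        "exactly": {"deviceClassName": "gpu.example.com"},
                    }
                ]
            }
        }
    },
}

print(json.dumps(claim_template, indent=2))  # e.g. pipe to: kubectl apply -f -
```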
Cloud providers offering GPU or Neo Cloud services need accurate and automated mechanisms to track resource consumption.
Whether you’re training deep learning models, running simulations, or just curious about your GPU’s performance, nvidia-smi is your go-to command-line tool.
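Beyond the familiar utilization table that plain `nvidia-smi` prints, its CSV query mode is handy for scripting. A small Python wrapper around the real `--query-gpu` flags (the field list is trimmed for brevity; requires an NVIDIA driver on the host):

```python
import subprocess

# Poll GPU stats via nvidia-smi's CSV query mode.
FIELDS = "name,utilization.gpu,memory.used,memory.total,temperature.gpu"

def gpu_stats() -> list[dict[str, str]]:
    """Return one dict of queried fields per visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    keys = FIELDS.split(",")
    return [dict(zip(keys, line.split(", "))) for line in out.strip().splitlines()]

if __name__ == "__main__":
    for gpu in gpu_stats():
        print(gpu)
```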
In this post, we’ll look at how a new GA feature in Kubernetes v1.34 — Dynamic Resource Allocation (DRA) — aims to solve these problems and transform GPU scheduling in Kubernetes.
Kubernetes has cemented its position as the de-facto standard for orchestrating containerized workloads in the enterprise.
In this blog, we will describe how Rafay Zero Trust Kubectl Access Proxy gives Argo CD a secure path to every cluster in the fleet, even when those clusters sit deep behind corporate firewalls.
Learn about Rafay's approach to drift prevention and detection in this blog.
ArgoCD is a powerful GitOps controller for Kubernetes, enabling declarative configuration and automated synchronization of workloads.
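As a taste of what declarative means here: the desired state lives in an Application resource, and an automated sync policy keeps the cluster reconciled to Git. A hedged sketch of such a manifest as a Python dict (repo URL, path, and names are placeholders; field names follow the argoproj.io/v1alpha1 API):

```python
import json

# Sketch of an Argo CD Application with automated sync enabled.
application = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "metadata": {"name": "guestbook", "namespace": "argocd"},
    "spec": {
        "project": "default",
        "source": {
            "repoURL": "https://github.com/example/gitops-repo.git",  # placeholder
            "targetRevision": "HEAD",
            "path": "apps/guestbook",
        },
        "destination": {
            "server": "https://kubernetes.default.svc",
            "namespace": "guestbook",
        },
        # prune deletes resources removed from Git; selfHeal reverts live drift
        "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
    },
}

print(json.dumps(application, indent=2))
```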
As demand for GPU-accelerated workloads soars across industries, cloud providers are under increasing pressure to offer flexible, cost-efficient, and isolated access to GPUs.
As GPU acceleration becomes central to modern AI/ML workloads, Kubernetes has emerged as the orchestration platform of choice.
In the modern era of containerized machine learning and AI infrastructure, GPUs are a critical and expensive asset.
Artificial Intelligence (AI) has moved far beyond simple chatbots and rigid automation. At the frontier of this evolution lies a powerful new paradigm: AI Agents.
In multi-tenant GPU cloud environments, effective resource management is critical to ensure fair usage and prevent contention.
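One common guardrail, among others the post explores, is a per-namespace ResourceQuota on the GPU extended resource. A minimal sketch (assuming the NVIDIA device plugin exposes nvidia.com/gpu; the tenant-a namespace is hypothetical):

```python
import json

# Cap a tenant namespace at 4 GPUs via a ResourceQuota on the extended
# resource exposed by the NVIDIA device plugin (requests.nvidia.com/gpu).
quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "tenant-a-gpu-quota", "namespace": "tenant-a"},
    "spec": {"hard": {"requests.nvidia.com/gpu": "4"}},
}

print(json.dumps(quota, indent=2))
```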
If you’re running Kubernetes workloads on Amazon EKS backed by Intel-based instances, you’re leaving significant savings on the table.
Together, Project Slinky and Rafay’s GPU Platform-as-a-Service (PaaS) give enterprises and cloud providers a transformative combination: secure, multi-tenant, self-service access to Slurm-based HPC environments on shared Kubernetes clusters.
As high-performance computing (HPC) environments evolve, there’s an increasing demand to bridge the gap between traditional HPC job schedulers and modern cloud-native infrastructure.
In Kubernetes, exposing services of type LoadBalancer in on-prem or bare-metal environments typically requires a dedicated “Layer 2” or “BGP-based” software load balancer—such as MetalLB.
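The Service itself looks the same everywhere; what bare metal lacks is a controller to answer for the external IP. A minimal sketch of a LoadBalancer Service as a Python dict (app name and ports are placeholders; an implementation such as MetalLB must be installed for the address to actually be assigned):

```python
import json

# A plain Service of type LoadBalancer; on bare metal, something like
# MetalLB must be present to allocate and announce the external IP.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web", "namespace": "default"},
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "web"},
        "ports": [{"port": 80, "targetPort": 8080, "protocol": "TCP"}],
    },
}

print(json.dumps(service, indent=2))
```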
In the fast-evolving world of GPU cloud services and AI infrastructure, accurate, flexible, and real-time billing is no longer optional — it’s mission critical.
Enterprises are increasingly leveraging Amazon SageMaker AI to empower their data science teams with scalable, managed machine learning (ML) infrastructure.
In this step-by-step guide, a bioinformatics data scientist will use Rafay’s end user portal to launch a well-resourced remote VM and run a series of BioContainers with Docker.
In today’s fast-paced world of bioinformatics, the constant evolution of tools, dependencies, and operating system environments presents a significant challenge.
Enterprises are turning to AI/ML to solve new problems and simplify their operations, but running AI in the datacenter often compromises performance. Edge inference moves workloads closer to users, enabling low-latency experiences with less overhead, but managing GPUs (Graphics Processing Units) across distributed infrastructure has traditionally been cumbersome.