
How GPU Clouds Deliver NVIDIA Run:ai as Self-Service with Rafay GPU PaaS

January 22, 2026

As demand for AI training and inference surges, GPU Clouds are increasingly looking to offer their users higher-level, turnkey AI services rather than just raw GPU instances. Many of these customers are already familiar with NVIDIA Run:ai, an AI workload and GPU orchestration platform.

Delivering NVIDIA Run:ai as a scalable, repeatable managed service—something customers can select and provision with a few clicks—requires deep automation, lifecycle management, and tenant isolation capabilities. This is exactly what Rafay provides.

With Rafay, GPU Clouds, including NVIDIA Cloud Partners, can deliver NVIDIA Run:ai as a managed service with self-service provisioning, ensuring customers receive a fully configured NVIDIA Run:ai environment automatically, complete with GPU infrastructure, a Kubernetes cluster, necessary operators, and a ready-to-use NVIDIA Run:ai tenant. This post explains how Rafay enables cloud providers to industrialize NVIDIA Run:ai provisioning into a consistent, production-ready managed service.

For GPU Clouds, managed services with self-serve provisioning offer tremendous benefits:

  1. Predictable, standardized offerings for customers
  2. Reduced complexity, since the managed service layer abstracts the underlying infrastructure
  3. Faster onboarding, enabling customers to begin using NVIDIA Run:ai in minutes
  4. Higher margins, by offering value-added services instead of raw compute
  5. Scalability, allowing dozens or hundreds of customers/tenants to onboard seamlessly

In short, transforming NVIDIA Run:ai into a cloud-managed service allows GPU Clouds to deliver value-added services in a scalable way. The experience begins in the GPU Cloud provider’s marketplace or self-service portal. Customers simply choose the NVIDIA Run:ai service, which can support variations such as:

  • NVIDIA Run:ai Standard — 4 GPUs (e.g., L40S or A100)
  • NVIDIA Run:ai Enterprise — 8 GPUs (e.g., H100)
  • Multi-node NVIDIA Run:ai deployment (e.g., 2× H100 nodes with 16 GPUs total)

Each service tier is configured in the Rafay platform by the cloud provider’s administrator, who decides which options to expose to customers.
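For illustration, the tiers above could be modeled as a small catalog. The sketch below is a minimal Python rendering of that idea; the class name, fields, and tier identifiers are hypothetical and do not reflect Rafay’s actual configuration schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunAiServiceTier:
    """One entry in the provider's self-service catalog (hypothetical schema)."""
    name: str
    gpu_type: str
    gpus_per_node: int
    node_count: int

    @property
    def total_gpus(self) -> int:
        return self.gpus_per_node * self.node_count

# Catalog entries mirroring the example tiers above
CATALOG = [
    RunAiServiceTier("runai-standard", "L40S", gpus_per_node=4, node_count=1),
    RunAiServiceTier("runai-enterprise", "H100", gpus_per_node=8, node_count=1),
    RunAiServiceTier("runai-multinode", "H100", gpus_per_node=8, node_count=2),  # 16 GPUs total
]
```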

Seamless Orchestration under the Covers

Once the user selects deploy, Rafay orchestrates the required infrastructure, deploys and configures software dependencies, and finally installs the NVIDIA Run:ai software. The sequence diagram below provides additional context on what happens at each step.
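To make that sequence concrete, here is a minimal, self-contained Python sketch of the same flow. Every function here is an illustrative stand-in for a Rafay automation step, not a real Rafay or NVIDIA Run:ai API.

```python
def provision_gpu_nodes(gpu_type: str, gpus_per_node: int, count: int) -> list[dict]:
    # Stand-in for provisioning GPU servers or GPU-enabled VMs in the datacenter
    return [{"id": f"node-{i}", "gpu_type": gpu_type, "gpus": gpus_per_node}
            for i in range(count)]

def provision_kubernetes_cluster(nodes: list[dict]) -> dict:
    # Stand-in for standing up the control plane and worker nodes (e.g., Rafay MKS)
    return {"name": "runai-cluster", "nodes": nodes, "addons": []}

def install_cluster_addons(cluster: dict, addons: list[str]) -> None:
    # Stand-in for deploying the GPU operator, monitoring, logging, and observability
    cluster["addons"].extend(addons)

def create_runai_tenant(customer_id: str) -> dict:
    # Stand-in for creating a dedicated NVIDIA Run:ai tenant via API
    return {"customer": customer_id,
            "portal_url": f"https://runai.example-gpu-cloud.com/{customer_id}",
            "admin_credentials": "<delivered securely>"}

def deploy_runai_service(customer_id: str, gpu_type: str,
                         gpus_per_node: int, node_count: int) -> dict:
    nodes = provision_gpu_nodes(gpu_type, gpus_per_node, node_count)
    cluster = provision_kubernetes_cluster(nodes)
    install_cluster_addons(cluster, ["gpu-operator", "monitoring", "logging"])
    tenant = create_runai_tenant(customer_id)
    tenant["cluster"] = cluster["name"]  # register the cluster with the tenant
    return tenant  # what the user receives when deployment completes

print(deploy_runai_service("acme-ai", "H100", gpus_per_node=8, node_count=2))
```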

Once deployment is complete, Rafay presents the user with:

  1. NVIDIA Run:ai Administrative Portal URL
  2. NVIDIA Run:ai tenant administrator credentials

Users now have a complete NVIDIA Run:ai deployment delivered through a single self-serve request. The NVIDIA Run:ai administrator can add end users via the console and begin scheduling workloads on available GPU resources.

Infrastructure Automation

Rafay automatically provisions the desired GPU infrastructure in the GPU Cloud's datacenter. Specifically, it:

  • Provisions physical GPU servers or GPU-enabled VMs
  • Configures networking, storage, security groups, and VPC isolation
  • Provisions a production-grade Kubernetes cluster (e.g., Rafay MKS, including the control plane and worker nodes)
  • Deploys and configures cluster add-ons, monitoring, logging, and observability components
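As a rough illustration of how the last step might be verified, the sketch below polls a hypothetical cluster-status endpoint until the cluster and its add-ons report healthy. The URL and response shape are assumptions for illustration, not Rafay’s actual API.

```python
import time

import requests

def wait_for_cluster_ready(status_url: str, timeout_s: int = 1800) -> None:
    """Poll a (hypothetical) cluster-status endpoint until everything is healthy."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = requests.get(status_url, timeout=30).json()
        # Assumed response shape:
        # {"cluster": "READY", "addons": {"gpu-operator": "READY", "monitoring": "READY"}}
        if status.get("cluster") == "READY" and all(
            state == "READY" for state in status.get("addons", {}).values()
        ):
            return
        time.sleep(30)
    raise TimeoutError(f"Cluster at {status_url} not ready within {timeout_s}s")
```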

NVIDIA Run:ai Tenant Automation

To truly deliver NVIDIA Run:ai as a managed service, creation of the associated tenant and integration with the control plane must be automated. Rafay handles the end-to-end workflow by:

  • Creating an NVIDIA Run:ai tenant via API
  • Registering the newly provisioned Kubernetes cluster
  • Verifying successful NVIDIA Run:ai operator deployment and onboarding
  • Ensuring tenant-level isolation of the environment

In a nutshell, customers receive a dedicated NVIDIA Run:ai environment without ever needing to touch the underlying infrastructure.
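To sketch what that automation might look like against a REST API, the example below creates a tenant, registers a cluster, and verifies operator onboarding. The endpoints, payloads, and field names are illustrative assumptions, not the documented NVIDIA Run:ai or Rafay APIs.

```python
import requests

API = "https://runai-control-plane.example.com/api/v1"  # hypothetical control-plane URL

def create_tenant_and_register_cluster(token: str, customer: str, cluster_name: str) -> dict:
    headers = {"Authorization": f"Bearer {token}"}

    # 1. Create a dedicated Run:ai tenant for the customer (hypothetical endpoint)
    tenant = requests.post(f"{API}/tenants", json={"name": customer},
                           headers=headers, timeout=30).json()

    # 2. Register the newly provisioned Kubernetes cluster with the tenant
    cluster = requests.post(f"{API}/tenants/{tenant['id']}/clusters",
                            json={"name": cluster_name},
                            headers=headers, timeout=30).json()

    # 3. Verify the Run:ai operator finished onboarding before handing over credentials
    status = requests.get(f"{API}/clusters/{cluster['id']}/status",
                          headers=headers, timeout=30).json()
    if status.get("operator") != "CONNECTED":
        raise RuntimeError("Run:ai operator has not completed onboarding")

    return {"tenant": tenant, "cluster": cluster}
```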

Conclusion

Rafay transforms NVIDIA Run:ai from a manually deployed platform into a self-service, cloud-managed service that GPU Cloud providers can deliver with confidence. By automating everything from GPU infrastructure provisioning to tenant creation and cluster onboarding, Rafay ensures customers can begin using the NVIDIA Run:ai service within minutes.

Customers gain instant access to NVIDIA Run:ai, while cloud operators achieve:

  • Higher operational efficiency
  • Scalable onboarding of new customers
  • Stronger differentiation in the GPU Cloud market
  • A future-proof platform for expanding GPU-accelerated services

In support of this effort, we’re pleased to announce that Rafay is certified with NVIDIA Run:ai. As GPU Clouds look to deliver differentiated AI services at scale, automation across infrastructure provisioning, configuration, and lifecycle management is essential.

Rafay provides an enterprise-grade AI infrastructure platform that enables GPU Clouds to deliver production-ready AI services—simplifying operations, ensuring consistency, and accelerating innovation for customers.
