EVENT

PlatformCon23

PlatformCon23 is one of the year's biggest platform engineering events, bringing top DevOps and platform engineering leaders together on one virtual stage for two days. Join Rafay at PlatformCon to explore:

  • A blueprint for platforming Kubernetes operations: This talk dives into the challenges, design considerations, tradeoffs, and approaches platform engineering teams can take when building a platform for Kubernetes operations. We’ll discuss how to balance an incredible developer experience with enterprise-wide governance and automation.
  • A blueprint for enabling multi-tenancy for production Kubernetes clusters: Learn best practices for enabling a shared multi-tenant cluster model in your enterprise to save on costs while still meeting your developer self-service needs and your organization's governance and security requirements.
  • Enable secure self-service access to Kubernetes clusters with Paralus: This talk dives into the challenges and design considerations that platform teams have to take into account to enable secure KubeAPI server access for their users. We’ll discuss how Paralus OSS can make it extremely simple for enterprises to implement a zero-trust model to achieve this.


AI Factory FAQs

Learn how Rafay helps companies go from idle and expensive GPUs to building fully-scaled AI factories to accelerate AI and ML innovations.

Who uses AI factories?

AI factories are used by enterprises, cloud service providers, and sovereign AI clouds that need to scale AI workloads efficiently, maximize GPU utilization, and deliver AI as a production service rather than isolated projects. You can see how Rafay worked with Canadian telecommunications provider Telus in this case study.

What role does Rafay play in AI factories?

Rafay provides the control plane for AI factories, handling orchestration, multi-tenancy, governance, and self-service access to AI infrastructure across cloud, on-prem, and sovereign environments.

Is Rafay an AI factory?

Rafay is not a GPU manufacturer or model provider. Rafay provides an infrastructure orchestration and consumption platform that enables organizations to operate AI factories by turning AI infrastructure into a governed, self-service platform.

Does Rafay support NVIDIA NIMs/NIM?

Yes, Rafay supports NVIDIA NIM (NVIDIA Inference Microservices). NIM is NVIDIA’s proprietary solution for delivering packaged inferencing capabilities. It comes pre-configured with NVIDIA’s in-house models and has been optimized for use with a wide range of open-source models, including Meta’s Llama variants. While NIM is often viewed as an alternative to the open-source KServe package, Rafay’s platform supports both NIM and KServe. This flexibility allows customers to choose their preferred inference endpoint and deploy it on GPU instances using the Rafay platform. By supporting multiple inferencing solutions, Rafay enables organizations to use the tools best suited to their specific AI/ML needs while maintaining a consistent and manageable infrastructure.

How is Rafay different from Run:AI?

Run:AI focuses on providing fractional/virtualized GPU consumption and a proprietary scheduler optimized for AI/GenAI workloads, replacing the default Kubernetes scheduler. Rafay, however, provides a more comprehensive platform that manages the full lifecycle of underlying Kubernetes clusters and environments. Rafay offers an out-of-the-box experience to deploy and consume Run:AI on Rafay’s GPU PaaS, while also providing its own GPU virtualization and AI-friendly Kubernetes scheduler for customers preferring a single-vendor solution. Essentially, Rafay can either complement Run:AI’s offerings or provide a standalone solution that covers similar functionalities along with broader infrastructure management capabilities, giving customers flexibility in their AI infrastructure choices.

Does Rafay offer a GPU PaaS?

Yes, Rafay provides infrastructure orchestration and workflow automation for cloud-native (Kubernetes) and AI use cases for enterprises, cloud providers, neoclouds, and sovereign AI clouds. Rafay helps companies deploy a Platform-as-a-Service (PaaS) experience that supports both CPU-only and GPU-accelerated compute environments. Platform teams can set up and deliver customized self-service experiences for developers and data scientists, typically within days or weeks. This flexible platform gives end users access to the computational resources they need, whether that is standard CPU processing or more powerful GPU capabilities. Rafay’s solution streamlines the deployment and management of diverse computing environments, making it easier for organizations to support a wide range of applications, from standard software to complex AI/ML projects.
