Operationalizing AI Fabrics with Aviz ONES, NVIDIA Spectrum-X, and Rafay
Discover the new AI operations model available to enterprises that enables self-service consumption and cloud-native orchestration for developers.
Cloud-native technologies are fast becoming an integral part of IT environments as organizations continuously accelerate their development efforts to meet business demands. Cloud-Native Days 2021 explores the cloud-native ecosystem beyond Kubernetes and ways in which organizations can leverage cloud-native technologies to move faster and more securely.
Sessions will be geared toward practitioners, managers and C-level executives, and led by industry thought leaders and doers in the cloud-native space. Attendees will walk away with a better understanding of cloud-native and its impact on the IT landscape.
Mohan Atreya, Rafay Systems' SVP of Products and Services, presented the following:
Streamlining Amazon EKS Operations & Management
Kubernetes is the de facto container orchestration tool, and Amazon AWS is the leading cloud platform. But when you have to rapidly scale your Kubernetes deployments in AWS, you may find that Amazon EKS demands skills and headcount that your organization doesn’t have. View this recording to learn:

Learn how Rafay helps companies go from idle and expensive GPUs to building fully-scaled AI factories to accelerate AI and ML innovations.
AI factories are used by enterprises, cloud service providers, and sovereign AI clouds that need to scale AI workloads efficiently, maximize GPU utilization, and deliver AI as a production service rather than isolated projects. You can see how Rafay worked with Canadian telecommunications provider Telus in this case study.
Rafay provides the control plane for AI factories, handling orchestration, multi-tenancy, governance, and self-service access to AI infrastructure across cloud, on-prem, and sovereign environments.
Rafay is not a GPU manufacturer or model provider. Rafay provides an infrastructure orchestration and consumption platform that enables organizations to operate AI factories by turning AI infrastructure into a governed, self-service platform.
Yes, Rafay supports NVIDIA NIM (NVIDIA Inference Microservices). NIM is NVIDIA’s proprietary solution for delivering packaged inferencing capabilities. It comes pre-configured with NVIDIA’s in-house models and has been optimized for use with a wide range of open-source models, including Meta’s Llama variants. While NIM is often viewed as an alternative to the open-source KServe package, Rafay’s platform supports both NIM and KServe. This flexibility allows customers to choose their preferred inference endpoint and deploy it on GPU instances using the Rafay platform. By supporting multiple inferencing solutions, Rafay lets organizations use the tools best suited to their specific AI/ML needs while maintaining a consistent and manageable infrastructure.
Run:AI focuses on providing fractional/virtualized GPU consumption and a proprietary scheduler optimized for AI/GenAI workloads, replacing the default Kubernetes scheduler. Rafay, by contrast, provides a more comprehensive platform that manages the full lifecycle of the underlying Kubernetes clusters and environments. Rafay offers an out-of-the-box experience to deploy and consume Run:AI on Rafay’s GPU PaaS, while also providing its own GPU virtualization and AI-friendly Kubernetes scheduler for customers who prefer a single-vendor solution. In short, Rafay can either complement Run:AI’s offerings or serve as a standalone solution that covers similar functionality along with broader infrastructure management capabilities, giving customers flexibility in their AI infrastructure choices.
Yes, Rafay provides infrastructure orchestration and workflow automation for cloud-native (Kubernetes) and AI use cases for enterprises, cloud providers, neoclouds, and sovereign AI clouds. Rafay helps companies deploy a Platform-as-a-Service (PaaS) experience that supports both CPU-only and GPU-accelerated compute environments. Platform teams can quickly set up and deliver customized self-service experiences for developers and data scientists, typically within days or weeks. This flexible platform gives end users easy access to the computational resources they need, whether that is standard CPU processing or more powerful GPU capabilities. Rafay’s solution streamlines the deployment and management of diverse computing environments, making it easier for organizations to support a wide range of applications, from standard software to complex AI/ML projects.