Infrastructure Automation for Generative AI

Automate the Infrastructure that drives your Company’s AI Journey

Rafay has helped dozens of enterprises accelerate their modernization and AI initiatives. Come build with us.

Rafay provides a modern solution that helps Guardant Health be prepared for the future.

William Baird, Manager of Infrastructure Engineering

Guardant Health

Bring your State-of-the-Art AI Applications to Market Faster

Rafay’s ready-made templates for Generative AI use cases speed up the enterprise AI journey.


Sentiment analysis, chatbots, automated help, and text classification


Fraud detection, predictive analysis, and 360-degree customer sentiment analysis

Image Analysis

Object detection & classification, OCR, healthcare imaging, and defect identification

Audio Analysis

Speech recognition, audio signature detection, and voice generation

Key Requirements for AI Infrastructure Automation

To support AI adoption at enterprise scale, top-performing companies solve for the following key requirements:

Autonomy for Developers & Data Scientists
  1. Self-service creation of, and access to, cloud infrastructure for AI applications
  2. Pre-defined golden-path workflows for AI applications and underlying infrastructure, including landing zones and functioning application code
  3. Pre-built templates for consumption of public and private LLMs, e.g., Amazon Bedrock and ChatGPT 3.5
  4. Self-service access to monitoring and troubleshooting, including GPU usage
Control & Efficiency for Platform Teams
  1. Provide AI infrastructure-as-a-service for developers and data scientists
  2. Centralized management of RBAC integrated with enterprise SSO
  3. Pre-test, integrate and manage Kubernetes software add-ons
  4. Multi-tenancy with isolation by user, application, label, etc.
  5. Chargeback & showback FinOps reporting governed by multi-tenancy
  6. OPA & network policy definition and application via blueprints & templates
  7. Cloud and Kubernetes cluster provisioning and fleet operations
  8. Standardized environment & Kubernetes templates
  9. Provide dashboard & performance monitoring governed by multi-tenancy
  10. Pre-built integrations with Amazon Bedrock, Azure OpenAI, OpenAI, Slurm, Kubeflow and MLflow
  11. Broad support for NVIDIA GPUs on premises and in public clouds
  12. Support for Amazon ECS; the EKS/A, Microsoft AKS and Google GKE managed Kubernetes services; and upstream Kubernetes, across private data centers, public clouds such as AWS, Microsoft Azure and GCP, and edge/remote locations

Key Features that Accelerate your GenAI initiatives

With Rafay, you get one unified platform that provides self-service AI infrastructure to your developers and data scientists while simplifying the ongoing operations of your AI/ML applications.

Self-Service Experience

Rafay allows developers and data scientists to deploy, view, and manage their GenAI applications and infrastructure in isolation using self-service workflows via Rafay & Backstage.

AI/ML Ecosystem Support

Out-of-the-box support for LLM providers including Amazon Bedrock, Azure OpenAI and OpenAI.

AI Applications & Source Code

Includes several generative AI and AI workbench applications with source code, such as a text summarization app and a chatbot app using GenAI.

Any Orchestration, Any Cloud

Pre-built templates for Amazon ECS, EKS/A, Microsoft AKS and Google GKE in public clouds, as well as in private data centers and at edge locations.

Cluster and Workflow Standardization

Rafay’s Environment templates and Kubernetes blueprints allow platform teams to create a set of standard GenAI environments and make them available enterprise-wide.

Secure RBAC

Each developer, data scientist, researcher, etc. can create and destroy environments (but not templates built by platform teams) and operate them in isolation, governed by RBAC.
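The rule described above (users may create and destroy environments in isolation, but not the templates owned by platform teams) can be sketched as a simple permission check. This is a minimal illustration only: the role names, the `PERMISSIONS` table, the `User` type and the `is_allowed` function are hypothetical and are not Rafay's actual RBAC model or API.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission table, for illustration only.
PERMISSIONS = {
    "platform_admin": {
        ("template", "create"), ("template", "destroy"),
        ("environment", "create"), ("environment", "destroy"),
    },
    "developer": {
        ("environment", "create"), ("environment", "destroy"),
    },
}


@dataclass(frozen=True)
class User:
    name: str
    role: str
    project: str  # assumed isolation boundary


def is_allowed(user, kind, action, owner_project=None):
    """Return True if the user's role grants the action on this kind of resource."""
    if (kind, action) not in PERMISSIONS.get(user.role, set()):
        return False
    # Environments are isolated per project; templates are shared enterprise-wide.
    if kind == "environment":
        return owner_project == user.project
    return True
```

Under these assumptions, a developer in project `team-a` can destroy an environment owned by `team-a`, but cannot touch an environment owned by `team-b` or any template.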

Integrated GPU and Kubernetes Metrics

Rafay automatically captures and aggregates both Kubernetes and GPU metrics at the controller in a multi-tenant time-series database.
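As a rough picture of what per-tenant aggregation of GPU metrics involves, the sketch below averages raw utilization samples by tenant and cluster. The sample fields (`tenant`, `cluster`, `gpu_util_pct`) and the function name are assumptions made for illustration; they do not describe Rafay's metric schema or storage.

```python
from collections import defaultdict
from statistics import mean


def aggregate_gpu_utilization(samples):
    """Average GPU utilization per (tenant, cluster) from raw samples.

    Each sample is a dict with the hypothetical keys
    'tenant', 'cluster' and 'gpu_util_pct'.
    """
    grouped = defaultdict(list)
    for sample in samples:
        key = (sample["tenant"], sample["cluster"])
        grouped[key].append(sample["gpu_util_pct"])
    return {key: mean(values) for key, values in grouped.items()}
```

Keying every aggregate by tenant is what lets a dashboard show each team only its own numbers.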

Multitenancy for AI/ML Apps

Enterprises commonly have different teams share clusters, sometimes with specific LLM resources, to save costs. Rafay’s multi-modal multi-tenancy capabilities support multiple AI/ML teams on the same Kubernetes cluster.

Chargeback & Showback

Rafay provides each isolated unit with financial metrics, including chargeback and showback, for its AI applications across private and public clouds.
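Showback reporting comes down to attributing metered usage to tenants and pricing it. The sketch below sums cost per tenant from usage records; the record layout, the rate table and the `showback` function are hypothetical and are not taken from Rafay's product.

```python
from collections import defaultdict


def showback(usage_records, rates):
    """Sum cost per tenant from (tenant, resource, hours) usage records.

    `rates` maps a resource name to an assumed price per hour.
    """
    totals = defaultdict(float)
    for tenant, resource, hours in usage_records:
        totals[tenant] += hours * rates[resource]
    return dict(totals)


# Illustrative numbers only.
usage = [
    ("team-a", "gpu", 10.0),
    ("team-a", "cpu", 100.0),
    ("team-b", "gpu", 4.0),
]
rates = {"gpu": 2.5, "cpu": 0.25}
report = showback(usage, rates)
```

The same totals serve both purposes: showback reports the cost to each team, while chargeback bills it to them.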

Support for Traditional AI Platforms

Rafay also supports traditional AI/ML platforms such as Slurm, Kubeflow and MLflow.

Leverage the power of Generative AI and Rafay to realize the following benefits:

Faster development and time-to-market for all AI/ML applications

Realize the business benefits of GenAI sooner

Democratization of data and AI skills

Creates a culture of innovation powered by GenAI

Blogs from the Kubernetes Current


Introducing BRAVE (Bare Metal Replication And Virtualization Environment): A new Open Source Tool to Virtualize Bare Metal Deployments

December 5, 2023 / by Robbie Gill

We are thrilled to announce our ongoing commitment to Open Source through our donation of an open-source project to the community. This initiative, known as BRAVE (Bare Metal Replication And Virtualization Environment), aims to provide a virtual, cost-effective, and automated…


Bare Metal Replication And Virtualization Environment (BRAVE)

November 30, 2023 / by Sean Wilcox

BRAVE (Bare Metal Replication And Virtualization Environment) offers a virtual, cost-efficient, convenient, automated and on-demand tool for executing use cases requiring bare metal infrastructure. Cost and complexity of bare metal deployments can be prohibitive for a number…


Upgrading Amazon EKS Clusters in 2023

November 29, 2023 / by Sean Wilcox

Kubernetes is a rapidly evolving open-source project with periodic releases. And organizations embracing Kubernetes must adopt the practice of regular upgrades. This is because Kubernetes doesn’t follow the Long Term Support (LTS) concept. Instead, they have created…