AI & Data Science Workbench as a Service

Self-Service Access to AI Workbenches for Your Developers

Provide developers, data scientists, researchers, and all cloud users with self-service access to AI Workbenches using proven templates with guardrails included.

Why AI Workbenches-as-a-Service?

Every modern enterprise is leveraging AI. Enterprises that streamline the process of learning and experimenting with AI by providing self-service for developers and data scientists gain significant benefits.

Speed AI Journey

Empower developers and data scientists to deploy the AI environments they need, when they need them.

Reduce Overhead

Reusable AI templates reduce the need for platform and cloud teams to provision environments repeatedly.

Simplify Maintenance

Platform teams now update and maintain configurations via a set of reusable templates.

Unique Rafay Capabilities for AI Workbenches-as-a-Service

Dozens of enterprise platform teams leverage these unique features to rapidly build AI Workbenches-as-a-service automation with Rafay and delight their developers.

Lifecycle Management

Secure self-service for namespaces

Users should be able to provision namespaces but should not have access to resources outside of their namespaces

Infrastructure as Code (IaC) support

Support for Terraform-first or GitOps-first approaches, including private Git repos, that accelerate infrastructure deployment

Resource quotas for teams & applications

Define quotas to prevent noisy neighbor issues so total namespace resource requests do not exceed the configured limits
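
As a rough illustration of the quota idea above, the sketch below checks whether a new workload request would push a namespace's total requests past its configured limits. The `Resources` type and `fits_quota` function are hypothetical names used only for this example; they are not part of the Rafay platform or of any Kubernetes API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Resources:
    """CPU in millicores and memory in MiB (hypothetical helper type)."""
    cpu_millicores: int
    memory_mib: int


def fits_quota(quota: Resources, existing: Iterable[Resources], new: Resources) -> bool:
    """Return True if existing requests plus the new request stay within the quota."""
    existing = list(existing)
    total_cpu = sum(r.cpu_millicores for r in existing) + new.cpu_millicores
    total_mem = sum(r.memory_mib for r in existing) + new.memory_mib
    return total_cpu <= quota.cpu_millicores and total_mem <= quota.memory_mib


if __name__ == "__main__":
    team_quota = Resources(cpu_millicores=4000, memory_mib=8192)
    running = [Resources(1500, 2048), Resources(1000, 4096)]
    # A 1-CPU / 1 GiB notebook still fits; a 2-CPU one would exceed the CPU quota.
    print(fits_quota(team_quota, running, Resources(1000, 1024)))
    print(fits_quota(team_quota, running, Resources(2000, 1024)))
```

In a real cluster this check is typically performed by the Kubernetes ResourceQuota admission mechanism rather than by application code; the sketch only shows the arithmetic behind it.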

Management of namespace compliance

Easily manage compliance of pre-existing namespaces in the same manner (i.e. same guardrails) as new namespaces

Integrate with continuous delivery

Work with CD tools like Argo, enforcing guardrails (e.g. quotas, network policies) on namespaces created out of band

Centralized visibility into namespaces

Use cross account and cross cloud visibility to manage complex multi-cloud environments across teams, geos, and domains

Streamlined disaster recovery

Leverage one-step workflows to quickly and safely restore data from backups during disaster recovery events

Developer Self-Service

Flexible interfaces

Ability to consume the platform through the preferred interface: UI, Backstage, GitOps or CMDBs (e.g. ServiceNow)

Simple, streamlined process for requesting compute

No time-consuming, ticket-driven process in which the platform team has to manually provision namespaces

Visualization of namespace resources

View which resources are violating policies so that it is easy to remediate them and course-correct future actions

Streamlined experience for kubectl access

To help with scenarios such as:

  • Application right sizing exercises
  • Requesting platform team for additional compute

Repository of approved applications

Integrated, low-touch experience for installing applications that have been scanned for vulnerabilities

Governance

Network policies for namespace isolation

Enforce network policies so that namespaces belonging to different teams cannot communicate with each other
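
One common way to express this kind of isolation in Kubernetes is a NetworkPolicy that only admits ingress traffic from pods in the same namespace. The sketch below builds such a manifest as a Python dictionary. The function name and the policy name `allow-same-namespace-only` are assumptions made for illustration, and this is not necessarily the policy that the Rafay templates apply.

```python
import json


def same_namespace_only_policy(namespace: str) -> dict:
    """Build a NetworkPolicy manifest that allows ingress only from pods
    in the same namespace (hypothetical helper for illustration)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "allow-same-namespace-only",  # assumed name
            "namespace": namespace,
        },
        "spec": {
            # An empty podSelector applies the policy to every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A bare podSelector under "from" matches pods in this namespace only.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(same_namespace_only_policy("team-a"), indent=2))
```

The resulting JSON could be applied with standard Kubernetes tooling, provided the cluster runs a network plugin that enforces NetworkPolicy.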

Just in Time (JIT) user identity driven access

Implement RBAC at scale with your Identity Provider, without implementing expensive solutions (bastion, VPN, etc.) so users access only their namespaces.

Centralized kubectl access audits

Centralized visibility into user activities, plus the ability to export audits to an external system (e.g. Splunk, Datadog)

Chargeback and showback

Collect cluster utilization metrics for chargeback / showback models, including sharing costs across tenants for unallocated resources and common services
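
To make the cost-sharing idea concrete, here is a minimal sketch of one possible chargeback calculation. It assumes that each tenant is billed directly for the capacity it uses and that the cost of unallocated capacity is split in proportion to usage. Both the proportional rule and the `chargeback` function are assumptions for this example, not a description of how Rafay computes costs.

```python
from typing import Dict


def chargeback(total_cost: float, capacity: float, usage: Dict[str, float]) -> Dict[str, float]:
    """Split a cluster's cost across tenants.

    Each tenant pays for its own usage at the cluster's unit rate, and the cost
    of unallocated capacity is shared in proportion to usage (assumed rule).
    """
    used = sum(usage.values())
    if used == 0 or capacity <= 0:
        # Nothing to attribute; every tenant is charged zero.
        return {tenant: 0.0 for tenant in usage}

    rate = total_cost / capacity  # cost per unit of capacity
    unallocated_cost = total_cost - rate * used  # cost of idle capacity

    bills = {}
    for tenant, amount in usage.items():
        direct = rate * amount
        shared = unallocated_cost * amount / used
        bills[tenant] = direct + shared
    return bills


if __name__ == "__main__":
    # A 10-CPU cluster costing 100 per month; team-a uses 2 CPUs, team-b uses 6.
    for tenant, bill in chargeback(100.0, 10.0, {"team-a": 2.0, "team-b": 6.0}).items():
        print(f"{tenant}: {bill:.2f}")
```

Other sharing rules, such as an even split of idle cost across tenants, would only change how `shared` is computed.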

Identify underutilized namespaces

Collect granular utilization metrics from namespaces to show usage by CPU and memory

Centralized policy enforcement

Enforce policies for security, reliability and operational efficiency, with centralized visibility into policy violations

Compliance benchmarks

Run periodic scans against benchmarks (e.g. CIS, NSA hardening recommendations) and centrally aggregate the benchmark reports

Deployment Features

SaaS based

The default option – providing maximum efficiency and reliability for mature and growing customers

Self-hosted

A self-hosted, airgapped option may be necessary for highly regulated industries

Multi-tenant

“Namespace as a service” across multiple teams, with isolation and tight access controls

Download the Templates

More downloadable templates are coming soon. To get started providing self-service access to AI environments in your enterprise, talk to us about one of the templates below.

Workbench: Kubeflow with Amazon EKS

Environment
Kubernetes
LLM

AI Workbench with Kubeflow on AWS

Template

GenAI on EKS

Environment

Bedrock models running on Amazon EKS

Template

GenAI on ECS

Environment

Bedrock models running on Amazon ECS

Template

RAG: Anthropic Claude on AWS Bedrock

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Claude on AWS

RAG: Cohere on AWS Bedrock

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Cohere on AWS

RAG: Llama-2 on AWS

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Llama-2 on AWS

RAG: Mistral on AWS

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Mistral on AWS

RAG: Zephyr on AWS

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Zephyr on AWS

RAG: Llama-2 on OCI

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Llama-2 on OCI

RAG: Mistral on OCI

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Mistral on OCI

RAG: Zephyr on OCI

Environment
Kubernetes
LLM

Retrieval Augmented Generation for Zephyr on OCI

Finetuning: Llama-2 on AWS

Environment
Kubernetes
LLM

Finetuning Llama-2 on AWS

Finetuning: Mistral on AWS

Environment
Kubernetes
LLM

Finetuning Mistral on AWS

Finetuning: Llama-2 on OCI

Environment
Kubernetes
LLM

Finetuning Llama-2 on OCI

Finetuning: Mistral on OCI

Environment
Kubernetes
LLM

Finetuning Mistral on OCI

Co-Pilot: Wizardcoder on AWS

Environment
Kubernetes
LLM

Generative AI Co-Pilot with Wizardcoder on AWS

Co-Pilot: Wizardcoder on OCI

Environment
Kubernetes
LLM

Generative AI Co-Pilot with Wizardcoder on OCI

RLHF/DPO: Wizardcoder on AWS

Environment
Kubernetes
LLM

Optimizing LLMs using Wizardcoder on AWS

RLHF/DPO: Wizardcoder on OCI

Environment
Kubernetes
LLM

Optimizing LLMs using Wizardcoder on OCI

Code Generation: Wizardcoder on AWS

Environment
Kubernetes
LLM

Code Generation using Wizardcoder on AWS

Code Generation: Wizardcoder on OCI

Environment
Kubernetes
LLM

Code Generation using Wizardcoder on OCI

Speech-To-Text: Whisper on AWS

Environment
Kubernetes
LLM

Convert Speech To Text Using Whisper on AWS

Speech-To-Text: Whisper on OCI

Environment
Kubernetes
LLM

Convert Speech To Text Using Whisper on OCI

Workbench: Jupyter Notebook with Amazon EKS

Environment
Kubernetes
LLM

AI Workbench with Jupyter Notebook on AWS

Workbench: MLflow with Amazon EKS

Environment
Kubernetes
LLM

AI Workbench with MLflow on AWS

HPC: SLURM on AWS

Environment
Kubernetes
LLM

Deploy SLURM on AWS

Download the White Paper
How Enterprise Platform Teams Can Accelerate AI/ML Initiatives

"Rafay’s unified view for Kubernetes Operations & deep DevOps expertise has allowed us to significantly increase development velocity."

Alec Rooney

CTO