Kubernetes Clusters as a Service

Self-Service Access to Clusters for Your Developers

Provide developers, cloud operations, and all cloud users with self-service access to Kubernetes clusters using proven templates with guardrails included.

Why Clusters-as-a-Service?

Modern applications require modern orchestration in the form of Kubernetes. Enterprises that streamline cluster setup by giving developer and cloud operations teams self-service access gain significant benefits.

Increase Deployment Velocity

Deployments are 4x faster when clusters are available on demand vs. submitting tickets and waiting for infrastructure.

Simplify Cluster Management

Reusing templatized clusters with built-in policy reduces ongoing management overhead.

Reduce Cognitive Load

When you free your developers from having to call operations teams, you pave the way for them to deliver true value to your business.

Unique Rafay Capabilities for Cluster-as-a-Service

Dozens of enterprise platform teams leverage these unique features to rapidly build cluster-as-a-service automation with Rafay and delight their developers.

K8s Lifecycle Management

Multi-cloud & distro K8s

Support for running clusters in the cloud and on premises

Provisioning Anywhere

Provisioning support for AWS (EKS), Azure (AKS), GCP (GKE), OCI, Bare Metal (Upstream Kubernetes) and Edge (Upstream Kubernetes)

Infrastructure as Code (IaC)

Support for Terraform (TF) or GitOps-first approaches. Support for private Git repos.

Automated K8s Upgrades

Orchestrate K8s upgrades in a phased manner across a fleet of clusters. The platform team may want to perform Day 2 ops centrally or delegate them to downstream teams.
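
As an illustration of what a phased upgrade across a fleet can look like, here is a minimal Python sketch. It is not Rafay's implementation or API: the function names `plan_waves` and `run_rollout`, the cluster names, and the `upgrade` and `healthy` callbacks are all hypothetical. The idea is simply to upgrade small waves first and stop before touching later waves if a cluster in the current wave looks unhealthy.

```python
from typing import Callable, List, Sequence


def plan_waves(clusters: Sequence[str], wave_sizes: Sequence[int]) -> List[List[str]]:
    """Split an ordered fleet into upgrade waves.

    wave_sizes gives the size of each early wave; any clusters left over
    are placed in one final wave.
    """
    waves: List[List[str]] = []
    start = 0
    for size in wave_sizes:
        if size <= 0:
            raise ValueError("wave sizes must be positive")
        wave = list(clusters[start:start + size])
        if not wave:
            break
        waves.append(wave)
        start += size
    remainder = list(clusters[start:])
    if remainder:
        waves.append(remainder)
    return waves


def run_rollout(
    waves: Sequence[Sequence[str]],
    upgrade: Callable[[str], None],   # hypothetical: triggers one cluster upgrade
    healthy: Callable[[str], bool],   # hypothetical: post-upgrade health check
) -> List[str]:
    """Upgrade wave by wave and return the clusters that were upgraded.

    The rollout stops after the first wave that contains an unhealthy
    cluster, so later waves are left untouched.
    """
    upgraded: List[str] = []
    for wave in waves:
        for cluster in wave:
            upgrade(cluster)
            upgraded.append(cluster)
        if not all(healthy(cluster) for cluster in wave):
            break
    return upgraded
```

A platform team might order the fleet so that development clusters land in the first wave and production clusters in the last, which limits the blast radius of a bad upgrade.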

API Deprecation Checks

Ability to check for API deprecations before a K8s version upgrade
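
To make the idea of a deprecation check concrete, the following Python sketch scans already-parsed manifests for API versions that Kubernetes stops serving at or before a target release. It is an illustrative sketch, not how Rafay performs the check: `find_removed_apis`, `parse_version` and the sample manifests are hypothetical, and the `REMOVED_IN` table lists only a handful of well-known removals rather than a complete catalogue.

```python
from typing import Dict, Iterable, List, Mapping, Tuple

# (apiVersion, kind) -> first Kubernetes release that no longer serves it.
# A small, deliberately incomplete sample of well-known removals.
REMOVED_IN: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("networking.k8s.io/v1beta1", "Ingress"): (1, 22),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("policy/v1beta1", "PodSecurityPolicy"): (1, 25),
    ("policy/v1beta1", "PodDisruptionBudget"): (1, 25),
}


def parse_version(version: str) -> Tuple[int, int]:
    """Turn '1.25', 'v1.25' or 'v1.25.3' into (1, 25)."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def find_removed_apis(manifests: Iterable[Mapping], target_version: str) -> List[str]:
    """Return one finding per manifest whose API is gone in target_version."""
    target = parse_version(target_version)
    findings: List[str] = []
    for manifest in manifests:
        api_version = manifest.get("apiVersion", "")
        kind = manifest.get("kind", "")
        removed = REMOVED_IN.get((api_version, kind))
        if removed is not None and target >= removed:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append(
                f"{kind}/{name} uses {api_version}, removed in {removed[0]}.{removed[1]}"
            )
    return findings
```

Running a check like this before an upgrade gives teams a list of resources to migrate while the old API is still being served.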

Pre/Post Hooks for K8s upgrades

Examples of hooks include measuring application performance before and after a cluster upgrade, and initiating a backup operation before a cluster upgrade

Staged Rollout of Node OS upgrades

Ability to orchestrate Node OS upgrades (e.g. AMIs) in a phased manner across a fleet of clusters

Bring Pre-existing Clusters into Compliance Easily

Ability to enforce the same guardrails on pre-existing clusters as on newly provisioned ones

Disaster Recovery

Ability to back up and restore the Control Plane configuration (in the case of an unmanaged K8s distro) and the Data Plane (for stateful applications)

Audits

Logging of “who did what?” and export of audits to an external system (e.g. Splunk, Datadog), as needed to demonstrate compliance

Developer Self Service

Flexible User Interfaces

Ability to consume the platform through the preferred interface: UI, Backstage, GitOps or CMDBs (e.g. ServiceNow)

Simple Process for Compute

No time-consuming, ticket-driven process in which the Platform team has to provision clusters manually

Visualization of Resources

Ability for end users to quickly look at resources in the clusters and perform operations via the UI. This is especially useful for users who are less familiar with Kubernetes.

Streamlined Kubectl Access

No requirement for VPNs, bastion hosts, etc.

Temporary Kubectl Access

It shall be possible to implement a “break glass” procedure so that the application team user has temporary access to debug applications in a prod cluster

Visibility of Policy Violations

View into “what resources” are violating policies so that it is easy to remediate and course correct (for future actions)

Visibility of Resources and Cost

To help with scenarios such as application right-sizing exercises and requesting additional compute from the platform team

Repository of Approved Apps

Integrated, low-touch experience for installing applications that have been scanned for vulnerabilities etc.

Add-On Lifecycle Management

Installation of Add-ons

Ensure that clusters are always born with the right set of add-ons. Examples of add-ons include security and monitoring tools.

Cluster Overrides

Ability to inject values to a manifest dynamically so that a single add-on can be used org-wide, e.g. AWS Load Balancer Helm chart requires configuration of “clusterName”
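
One way to picture a cluster override is as a deep merge of per-cluster values onto a shared base set of values for the add-on. The Python sketch below assumes that model; it is not Rafay's override mechanism, and `apply_overrides` as well as every key other than “clusterName” are hypothetical names chosen for illustration.

```python
import copy
from typing import Any, Dict


def apply_overrides(base: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return base values with per-cluster overrides merged on top.

    Nested dictionaries are merged key by key; any other value in
    overrides replaces the base value. Neither input is modified.
    """
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Shared values for one add-on, reused org-wide (illustrative keys).
base_values = {
    "clusterName": "",
    "serviceAccount": {"create": True, "name": "lb-controller"},
}

# Only what differs per cluster is declared (hypothetical cluster names).
per_cluster = {
    "prod-east": {"clusterName": "prod-east"},
    "dev-west": {"clusterName": "dev-west", "serviceAccount": {"name": "lb-dev"}},
}

rendered = {name: apply_overrides(base_values, o) for name, o in per_cluster.items()}
```

The benefit of this shape is that the add-on definition stays single-sourced, while each cluster carries only the small set of values that must differ.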

Drift Detection & Blocking

Ensure that add-ons are always in a compliant state without needing expensive reconciliation tools

Automated Add-on Updates

Support for a staged rollout model for updating add-ons across clusters

Configurable Pre/Post Hooks for Updates

Ability to run custom scripts to determine whether the applications are running fine after an add-on update

Version Control for Add-ons

To ensure that only blessed versions of add-ons can be used and older versions can be retired or made unavailable. Also allows organizations to demonstrate compliance.

Golden Packs for Add-ons

Allow downstream teams to include more add-ons based on their requirements while ensuring the baseline set of add-ons mandated by the platform team always gets installed

Centralized Visibility

Visibility into add-ons/versions across clusters in the organization

Multi-Tenancy

Platform to Support Multiple Teams

Central platform that can deliver “cluster as a service” to multiple teams within the organization, with access to resources controlled by user identity

Curated List of Cluster Templates

Workflows to centrally define templates and share them selectively with downstream users (Developers, SREs) in a consistent manner. Ensures that provisioned clusters conform to approved guidelines.

Version Control for Cluster Templates

To ensure that only blessed versions of templates can be used and older versions can be retired or made unavailable. Also allows organizations to demonstrate compliance.

Overrides for Cluster Templates

A “one size fits all” approach doesn’t work for enterprises. It shall be possible for end users to selectively override cluster template configurations, e.g. instance types and regions (as deemed permissible by the platform team).
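
The combination of user overrides and platform-team limits can be sketched as a validation step that runs before a cluster is provisioned. The Python below is a minimal sketch under that assumption, not Rafay's template engine: `resolve_template`, the field names and the sample values are all hypothetical.

```python
from typing import Any, Dict, Mapping, Set


def resolve_template(
    defaults: Mapping[str, Any],
    allowed: Mapping[str, Set[Any]],
    requested: Mapping[str, Any],
) -> Dict[str, Any]:
    """Apply user overrides to template defaults, within platform limits.

    allowed maps each overridable field to the values the platform team
    permits. A field missing from allowed cannot be overridden at all.
    """
    errors = []
    resolved = dict(defaults)
    for field, value in requested.items():
        if field not in allowed:
            errors.append(f"{field}: override not permitted")
        elif value not in allowed[field]:
            errors.append(f"{field}: {value!r} is not an approved value")
        else:
            resolved[field] = value
    if errors:
        raise ValueError("; ".join(errors))
    return resolved


# Hypothetical template: users may pick instance type and region only.
template_defaults = {"instance_type": "m5.large", "region": "us-east-1", "k8s_version": "1.29"}
template_allowed = {
    "instance_type": {"m5.large", "m5.xlarge"},
    "region": {"us-east-1", "us-west-2"},
}
```

Rejecting the whole request when any field is out of bounds, rather than silently falling back to defaults, keeps the outcome predictable for the end user.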

Governance

Just in Time User Kubectl Access

Implementing K8s RBAC at scale with the company’s IdP as the source of truth, without the need for expensive solutions such as bastions, VPNs etc.

Kubectl Access Audits

Centralized visibility into user activities and the ability to export audits to an external system (e.g. Splunk, Datadog)

Compliance Benchmarks

Ongoing scans against benchmarks such as CIS, NSA hardening recommendations etc. Ability to securely access the fleet of clusters to run periodic scans and centrally aggregate the benchmark reports

Policies

Centralized enforcement of policies for security, reliability and operational efficiency. Centralized visibility into policy violations. Examples include allowing images only from blessed repos and ensuring that pods run with appropriate privileges.

Network Policies

Control egress/ingress traffic patterns for clusters (N/S traffic)

Chargeback/Showback

Collection of granular utilization metrics from clusters. Flexible chargeback/showback models.

Cost Optimization

Implement capabilities such as TTL (e.g. the cluster is automatically deprovisioned when its time-to-live expires) and Schedule (e.g. the cluster scales down and scales up based on a configured schedule)
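
The two cost controls named here reduce to simple time checks. The Python sketch below shows one possible reading: a TTL that marks a cluster for deprovisioning once it has lived long enough, and a weekday working-hours schedule that picks a node count. The function names, the weekday rule and the node counts are assumptions made for illustration, not Rafay's behaviour.

```python
from datetime import datetime, time, timedelta


def ttl_expired(created_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """True once the cluster has existed for at least its TTL."""
    return now >= created_at + ttl


def desired_nodes(
    now: datetime,
    work_start: time,
    work_end: time,
    scaled_up: int,
    scaled_down: int,
) -> int:
    """Node count for the current moment under a weekday schedule.

    Assumes Monday to Friday, from work_start (inclusive) to work_end
    (exclusive), is the scaled-up window; everything else is scaled down.
    """
    in_window = now.weekday() < 5 and work_start <= now.time() < work_end
    return scaled_up if in_window else scaled_down
```

A controller could evaluate both functions periodically, deprovisioning clusters for which `ttl_expired` returns true and resizing node pools to the value returned by `desired_nodes`.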

Identify Underutilized Clusters

Collection of granular utilization metrics from clusters to show usage by CPU and memory

Deployment Options

SaaS and Self-Hosted

A self-hosted, air-gapped option may be necessary for highly regulated industries such as public sector and biotech

Download the Templates

More downloadable templates are coming soon. So, to get started providing self-service access to clusters in your enterprise, talk to us about one of the templates below.

CaaS on GCP

Environment / Technology: Google Kubernetes Engine on Google Cloud Platform

CaaS on vCluster

Environment / Technology: vCluster on any Kubernetes

CaaS on vSphere

Environment / Technology: vSphere in Private Data Center

CaaS on EKS

Environment / Technology: Elastic Kubernetes Service on Amazon Web Services

CaaS on ECS

Environment / Technology: Elastic Container Service on Amazon Web Services

CaaS on Azure

Environment / Technology: Azure Kubernetes Service on Azure

CaaS on Upstream Kubernetes

Environment / Technology: Upstream Kubernetes in Private Data Center, Bare Metal, Edge using PhoenixNAP

CaaS on OKE

Environment / Technology: Clusters Using Oracle Container Engine for Kubernetes (OKE)

White Paper
Hybrid Cloud Meets Kubernetes

Learn how to Streamline Kubernetes Ops in Hybrid Clouds with AWS & Rafay

"Rafay’s unified view for Kubernetes Operations & deep DevOps expertise has allowed us to significantly increase development velocity."

Alec Rooney

CTO