Create, deploy, operate, monitor, upgrade and retire Kubernetes clusters across multiple, heterogeneous regions, clouds and environments
Kubernetes can accelerate feature development and enable organizations to remain agile and competitive. Using multiple Kubernetes clusters is already the norm for many organizations due to:
- The need to deliver great user experiences requiring a globally distributed, multi-region application footprint
- Interest in taking advantage of the underlying VM-based Infrastructure-as-a-Service (IaaS)
- A strong preference for “application specific clusters” driven by the need for separation, isolation and control
This results in an explosion of Kubernetes clusters and the subsequent need for “multi-cluster management”.
The Rafay team has been using Kubernetes for over four years, operating mission critical workloads. Kubernetes powers the Rafay Systems multitenant SaaS platform and our global sandbox of clusters deployed across multiple providers.
Organizations can leverage Rafay to help bridge the gap between theory and practice with Kubernetes, thus avoiding the steep learning curve and mundane tasks associated with the entire lifecycle of Kubernetes.
Rafay provides a dramatically simplified, streamlined set of workflows that removes the need for organizations to hire and retain a large team of Certified Kubernetes Admins (CKA) to manage and operate a fleet of Kubernetes clusters.
Provisioning a production grade Kubernetes cluster is not a trivial task. In addition to addressing the complexity of securely deploying and configuring the various components, organizations also have to support a broad set of use cases. This requires multiple Kubernetes bring up approaches.
- Provision on Any Type of Infrastructure
The Rafay platform provides organizations with highly optimized workflows to efficiently provision upstream Kubernetes on any type of infrastructure: bare metal, VMs, and cloud providers such as AWS, GCP and Azure.
Organizations can use Rafay to bring up Kubernetes clusters in bandwidth constrained or offline locations with an installer that comes prepackaged with all the required software components.
Organizations can also use Rafay for streamlined provisioning, management and deprovisioning of clusters using managed Kubernetes providers such as EKS, GKE and AKS.
Organizations can also use Rafay to provide visibility and monitoring of clusters provisioned by OpenShift, D2iQ or any third-party Kubernetes distribution.
Organizations can leverage Rafay to manage the lifecycle of clusters with high levels of automation. In addition to prescriptive, guided, GUI-based workflows, organizations can also drive cluster lifecycle workflows programmatically and embed them into their own automation.
- Use Case Optimized
Workflows need to be flexible and optimized to support a variety of use cases. For example, a cluster for QA testing may not require the same level of performance or availability as a production cluster. To save cost, organizations may wish to deploy it as a single-node, converged cluster. They may also wish to embed this into their Jenkins-based pipeline, programmatically deploying the cluster on demand and tearing it down after testing is complete.
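The on-demand pattern described above (create a cluster, run the tests, tear it down) can be sketched as a Python context manager. This is only an illustration under stated assumptions and not Rafay's API: `provision` and `teardown` are hypothetical callables standing in for whatever mechanism the pipeline actually uses to create and delete a cluster.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def ephemeral_cluster(
    name: str,
    provision: Callable[[str], str],
    teardown: Callable[[str], None],
) -> Iterator[str]:
    """Provision a short-lived cluster and always tear it down afterwards.

    `provision` and `teardown` are hypothetical hooks supplied by the caller;
    they are placeholders, not part of any real Rafay interface.
    """
    cluster_id = provision(name)
    try:
        yield cluster_id
    finally:
        # Runs even if the tests raise, so no QA cluster is left running.
        teardown(cluster_id)


if __name__ == "__main__":
    events = []
    with ephemeral_cluster(
        "qa-single-node",
        provision=lambda n: events.append(f"create {n}") or n,
        teardown=lambda c: events.append(f"delete {c}"),
    ) as cluster:
        events.append(f"test on {cluster}")
    print(events)
```

Wrapping the teardown in `finally` is the design choice that matters here: a failed test run should not leave a billable cluster behind.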
A Kubernetes cluster with only its master, worker nodes and etcd is not exactly useful. Several critical third party software add-ons need to be deployed before the cluster can be considered ready for application deployments. For example:
- Prometheus, or its equivalent, has to be deployed and correctly configured to ensure the cluster and deployed applications can be continuously monitored.
- Fluentd, or its equivalent, has to be deployed and correctly configured so that logs from both the cluster and deployed applications can be aggregated.
Rafay provides and maintains a curated, well-tested list of critical Rafay and 3rd party software add-ons as part of a “default” Rafay blueprint. This blueprint is automatically applied on Rafay-managed clusters ensuring the cluster is made ready for mission critical applications.
Customers can also optionally bring and manage their own library of software add-ons. They can use these add-ons with their custom “cluster blueprints” for their fleet of clusters.
Cluster Blueprints help provide a guarantee that clusters always have the specified list of critical third-party software. This enables organizations to dramatically simplify and operationalize the process of ensuring that clusters are in compliance with their business requirements.
For example, an organization dealing with payments may create and maintain a blueprint called “PCI Blueprint” that requires an Istio service mesh and StackRox security software as mandatory add-ons on all clusters targeted for PCI.
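Conceptually, a blueprint check reduces to comparing a required set of add-ons against what is installed on a cluster. The sketch below illustrates that idea only; the `Blueprint` class, the function names and the add-on identifiers are assumptions made for this example and do not reflect how Rafay models or enforces blueprints.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Blueprint:
    """A named set of add-ons that must be present on a cluster (illustrative model)."""
    name: str
    required_addons: FrozenSet[str]


def missing_addons(blueprint: Blueprint, installed: Iterable[str]) -> Set[str]:
    """Return the required add-ons that are not installed on the cluster."""
    return set(blueprint.required_addons) - set(installed)


def is_compliant(blueprint: Blueprint, installed: Iterable[str]) -> bool:
    """A cluster is compliant when no required add-on is missing."""
    return not missing_addons(blueprint, installed)


if __name__ == "__main__":
    # Hypothetical add-on identifiers, chosen only for this example.
    pci = Blueprint("PCI Blueprint", frozenset({"istio", "stackrox"}))
    print(missing_addons(pci, {"istio", "prometheus"}))
    print(is_compliant(pci, {"istio", "stackrox", "prometheus"}))
```

Extra add-ons on the cluster do not affect the result; only the absence of a required one makes the cluster non-compliant.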
Organizations that manage multiple Kubernetes clusters need a centralized multi-cluster operations console to provide deep visibility and monitoring of state and activity across clusters.
The Rafay cluster Agent maintains a heartbeat with the Rafay Controller providing organizations with a near real-time view of the state and health of the managed clusters.
With just a few clicks, and without requiring inbound access to the cluster control plane, operations teams can quickly drill down to assess the current state and health of every cluster node.
For example, they can quickly check the Kubernetes version for the Kubelet and the Operating System hosting the node. They can also check whether the node has reported memory, disk or PID pressure.
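For readers who want to see where this information lives, the sketch below extracts the same fields from a Kubernetes Node object in its JSON form (the shape returned by `kubectl get node <name> -o json`). The function name `node_summary` and the sample data are assumptions for illustration; this is not how the Rafay console is implemented.

```python
# Node condition types that signal resource pressure on a node.
PRESSURE_TYPES = ("MemoryPressure", "DiskPressure", "PIDPressure")


def node_summary(node: dict) -> dict:
    """Summarize kubelet version, OS image and active pressure conditions.

    `node` is expected to follow the shape of a Kubernetes Node object
    serialized as JSON and loaded into a dict.
    """
    status = node.get("status", {})
    info = status.get("nodeInfo", {})
    active = [
        c["type"]
        for c in status.get("conditions", [])
        # Condition status is the string "True", "False" or "Unknown".
        if c.get("type") in PRESSURE_TYPES and c.get("status") == "True"
    ]
    return {
        "kubelet_version": info.get("kubeletVersion"),
        "os_image": info.get("osImage"),
        "pressure": active,
    }


if __name__ == "__main__":
    # Made-up sample node, for illustration only.
    sample = {
        "status": {
            "nodeInfo": {"kubeletVersion": "v1.0.0", "osImage": "Example Linux"},
            "conditions": [
                {"type": "MemoryPressure", "status": "False"},
                {"type": "DiskPressure", "status": "True"},
                {"type": "PIDPressure", "status": "False"},
                {"type": "Ready", "status": "True"},
            ],
        }
    }
    print(node_summary(sample))
```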
Operations personnel can easily get down to the “pod” and “namespace” level on a selected Kubernetes cluster to check on its health status. Operational users can perform this across their entire portfolio of managed clusters. They can see output identical to that of “kubectl” without requiring inbound access to the cluster’s control plane or requiring access to privileged roles and permissions.
The Rafay Controller continuously monitors the health of both the cluster and the deployed applications. Cluster operations teams will be immediately sent an “alert” when the controller notices an issue.
For example, an alert is generated if the Controller notices a loss of cluster nodes resulting in significantly diminished cluster capacity. Organizations can also integrate these alerts directly into their existing incident response platforms such as PagerDuty, Opsgenie and BMC Remedy.
The entire history of alerts is maintained and can be leveraged for critical business and operational decisions. For example, the pattern of alerts may indicate that a certain provider has very frequent outages and the organization may decide to transition away from them.
Kubernetes, and its associated ecosystem of third-party software add-ons, changes frequently and quickly. There is typically a new Kubernetes minor release every three to four months. Depending on the nature of the change, the associated third-party software ecosystem may have to be updated as well.
On top of this, organizations should assume there will be frequent security related patches or bug fixes that have to be applied on an ongoing basis.
Organizations managing clusters require automation and curated workflows to ensure this entire process can be performed quickly and predictably, without significant effort or impact to business applications.
Rafay helps automate and streamline the entire set of workflows required to detect, report and upgrade both Kubernetes and the third-party software add-ons across clusters. Rafay’s intelligent Kubernetes upgrade workflows are designed so that a customer’s business-critical applications are not impacted during the upgrade process. They are built on the following assumptions:
- The application should not require downtime during the Kubernetes upgrade process
- The application should not suffer from a lack of orchestration capabilities during the Kubernetes upgrade process
- Upgrades can be scheduled and performed in customer provided time windows
- Upgrades are always performed with a canary approach – one canary cluster first, then followed by remaining clusters
- The application should be able to operate in a heterogeneous Kubernetes environment for extended periods of time – some clusters on latest versions and remaining on prior version.
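The canary approach in the list above can be sketched as follows. This is a minimal illustration of the general pattern rather than Rafay's upgrade workflow: `upgrade` and `is_healthy` are hypothetical hooks supplied by the caller, and treating the first cluster in the list as the canary is an assumption made for this example.

```python
from typing import Callable, List, Sequence


def canary_rollout(
    clusters: Sequence[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade the first cluster as a canary, then the rest only if it is healthy.

    Returns the clusters that were upgraded. `upgrade` and `is_healthy`
    are hypothetical caller-supplied hooks.
    """
    if not clusters:
        return []
    canary, rest = clusters[0], clusters[1:]
    upgrade(canary)
    upgraded = [canary]
    if not is_healthy(canary):
        # Stop here: the remaining clusters stay on the prior version.
        return upgraded
    for cluster in rest:
        upgrade(cluster)
        upgraded.append(cluster)
    return upgraded


if __name__ == "__main__":
    fleet = ["canary-east", "prod-west", "prod-eu"]
    print(canary_rollout(fleet, upgrade=lambda c: None, is_healthy=lambda c: False))
```

If the canary fails its health check, the fleet is left in the mixed-version state that the last assumption in the list says applications should tolerate.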
Although Kubernetes and containers allow organizations to automate many aspects of application deployment and provide significant business benefits, they can be vulnerable to attacks if not adequately protected. They also expose new attack surfaces that did not previously exist, and attackers will attempt to exploit them.
Harden your Cluster
Rafay helps organizations harden their Kubernetes clusters by ensuring that critical cluster-wide configuration settings are used and enabled (e.g. strong encryption and secure access for etcd).
Organizations can ensure that clusters are compliant with internal policy by leveraging cluster blueprints to guarantee that clusters always have the mission-critical security and software add-ons that may be necessary for activity monitoring.
Zero Trust and Secure Access
Organizations managing a global fleet of Kubernetes clusters cannot risk having their cluster control plane open to attackers on the Internet.
Rafay enables organizations to implement a “Zero Trust” access model for the cluster’s control plane. Organizations can completely cloak the cluster’s control plane and allow inbound access to it only from a bastion, and only to select, highly privileged administrators.
Organizations can easily implement separation of duties by ensuring that application teams and operations teams are each limited to the tasks within their own responsibilities.
Rafay enables organizations to avoid the cluster access credential sprawl that comes with creating, managing and monitoring the lifecycle of hundreds of roles and permissions for every user across the entire fleet of clusters.
Rafay enables organizations to enforce secure, fine-grained access to applications and clusters via role-based access control (RBAC) augmented with MFA and Single Sign-On (SSO) through integration with their identity provider, such as Azure Active Directory or Okta.
No Blind Spots
Rafay provides organizations with complete visibility and insight into all aspects of every cluster across the entire fleet and the applications running on them. Rafay captures and maintains a complete and detailed audit trail of all activities performed by users, ensuring that there are no blind spots for the security team.
Always Up to Date
Rafay provides organizations with a streamlined way by which they can keep their entire fleet of clusters up to date with the latest security patches and updates.
Encrypted Secrets Delivery and Management
Organizations can prevent orphaned secrets on clusters by automating and securing the delivery, provisioning and de-provisioning of secrets across multiple clusters. The secrets are automatically injected just in time (JIT) before the application workload is deployed to the target clusters.
Organizations can also optionally strengthen the security posture of their applications by leveraging Rafay’s out-of-the-box integration with a corporate secrets management platform such as HashiCorp Vault.