A Differentiated Approach to Scaling Kubernetes Cluster Management & Operations

How Team Rafay is extending Kubernetes to meet the needs of large enterprises and service providers

It’s been a fast-paced two years here at Rafay, with the company maturing into a healthy startup with engaged customers and a very busy engineering team focused on delivering a turnkey solution for Multi-Cluster Management & Application Operations. Many of us on the team also get to interact directly with customers’ DevOps and Operations engineers. Version-1 of our core platform has racked up a lot of miles in the field, giving the team ample data on where vanilla Kubernetes falls short for enterprises.

Over the last six months, my colleagues and I have been working on a number of extensions to Kubernetes that address a variety of use cases. Our customers have expressed a lot of interest in better understanding our implementation, specifically the components that reside on their Kubernetes clusters. As this work rolls out as part of the platform’s Version-2 release, we would like to share the implementation’s core design with the community. We also intend to open-source our implementation.

High-Level Goals and Architecture

Our deep-rooted customer engagements helped us put together a clear list of requirements that needed to be addressed by the platform:

1. Must not require inbound ports to be opened on firewalls: Enterprises operate Kubernetes clusters in heterogeneous environments. Be it in a VPC in Amazon or in a data center, enterprise security teams prefer not to have any entity that requires inbound access from the Internet. Furthermore, any artifact on a cluster that needs to reach an external service must be able to carry out all external interactions over HTTPS (tcp:443). Mutually authenticated TLS sessions are always desirable.

2. Must be able to federate multiple clusters into a manageable fleet: Enterprises tend to operate multiple Kubernetes clusters across public cloud regions, data centers, and the Edge. Customers must be able to manage all clusters as a fleet, not each cluster individually.

3. Must provide cluster bringup workflows with fleet-wide customization capabilities: If an enterprise has standardized on a certain methodology for logs & metrics collection (e.g. use fluentd and prometheus, respectively), TLS termination (e.g. use nginx as the ingress controller), etc., there must be an easy workflow for the DevOps team to apply such requirements across the entire fleet as needed, be it in the cloud, on premises, or at the Edge.

4. Must provide a way to normalize multiple configuration formats: An enterprise is likely to have multiple teams spread across multiple geographies, working independently on different applications. Enforcing a single configuration management framework across the enterprise may be highly impractical. Teams should be able to use their preferred format: Helm, Kustomize, or native Kubernetes YAML. The platform should be able to normalize any of these configurations into a single format.

5. Must guarantee real-time reconciliation of configuration across clusters: When operating a fleet of Kubernetes clusters, ensuring that no single cluster experiences configuration drift (due to pilot error, for example) is a non-trivial task. The platform should be able to detect configuration drift across the fleet and resolve it quickly.
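As a toy illustration of the drift-reconciliation idea in requirement 5, the sketch below compares content hashes of the desired and observed specs and triggers a reapply when they differ. The function names are illustrative only; a real reconciler would compare structured Kubernetes objects rather than raw strings.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// specHash returns a content hash of a (serialized) spec. Hashing gives a
// cheap fleet-wide comparison key without shipping full objects around.
func specHash(spec string) string {
	sum := sha256.Sum256([]byte(spec))
	return hex.EncodeToString(sum[:])
}

// drifted reports whether the live object has diverged from the desired state.
func drifted(desired, observed string) bool {
	return specHash(desired) != specHash(observed)
}

func main() {
	desired := "replicas: 3"
	observed := "replicas: 2" // e.g. a pilot error changed the live object
	if drifted(desired, observed) {
		fmt.Println("drift detected: reapplying desired state")
	}
}
```

In practice the comparison would run as a periodic anti-entropy pass, so a drifted cluster converges back to the published snapshot without operator intervention.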

In addition to the above objectives, we also wanted to keep our implementation’s footprint on the cluster as small as possible to maximize the resources available for the customer applications.

With these objectives in mind, we came up with the following architecture:

[Architecture diagram]

  • Rafay Config: This component resides in Rafay’s central, multi-tenant cluster manager and normalizes various Kubernetes application packaging formats to the Taskset format, Rafay’s internal representation. A Taskset holds all the Kubernetes objects that make up the original application package, slotted into phases such as “Init,“ “Install,“ “PostInstall,“ “PreDelete,“ and so on. When a Taskset is published, an immutable snapshot is created that can be applied across clusters. In the event that certain Kubernetes objects need to be customized per cluster, Rafay Config also exposes a concept called Overrides, which can be applied to Taskset snapshots on a per-cluster basis. A Taskset snapshot also specifies the namespace it belongs to, as well as the clusters it should be deployed to.
  • Cluster Scheduler: This component resides in Rafay’s central, multi-tenant cluster manager and is responsible for processing Taskset snapshots when they are published. As part of this processing, the Cluster Scheduler determines the set of clusters where the Taskset must be applied, taking into account any cluster-specific Overrides that may have been defined. A Taskset+Override combination results in a Cluster Task, which the Cluster Scheduler then communicates to the target cluster. The Cluster Scheduler also performs a periodic anti-entropy check against each cluster to ensure that it does not diverge from its desired state.
  • Rafay Connector: This component resides on each Kubernetes cluster and acts as a bridge between the Cluster Scheduler and the cluster’s Kubernetes API server. The Rafay Connector maintains a persistent, outbound connection to the Cluster Scheduler, over which it communicates with the scheduler in a best-effort manner. It also participates in the periodic anti-entropy checks with the Cluster Scheduler to make sure that there is no drift in the cluster configuration.
  • Rafay Cluster Controller: This component resides on each Kubernetes cluster and is the manager for the following controllers, all built using sigs.k8s.io/controller-runtime:
    • Step: Represents an operation performed on a Kubernetes resource. The operation is expressed as an Object that is patched, with an optional Job that can act on the object. The Job runs with a service account restricted to operating only on the resource in the namespace configured in the object. A Step is created from a Step template.
    • Task: Any Kubernetes application bundle can be represented as a Task. A Task contains an ordered list of Steps that need to be executed to install a Kubernetes application. These Steps are represented in .spec.init or in .spec.template (the Tasklet template). Any Task-specific pre-delete cleanup Steps are represented in .spec.preDelete.
    • Tasklet: A Tasklet is created from the Tasklet template. It has ordered lists of “Init,“ “Install,“ “PostInstall,“ and “PreDelete“ Steps, represented in .spec.init, .spec.install, .spec.postInstall, and .spec.preDelete, respectively. Tasklets are owned by Tasks.
    • Namespace: The Rafay Namespace CRD manages a Kubernetes Namespace and all of its namespace-level objects (ResourceQuota, NetworkPolicy, etc.) in concert.

These four components form the fundamental framework of our platform, helping us achieve our design objectives and more. The flexibility of our normalized representation, along with the platform’s multi-cluster federation capabilities, helped our teams rapidly implement a number of enterprise-focused features on top of the base platform.
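As a rough sketch of what a normalized representation buys downstream code, the illustrative types below reduce Helm, Kustomize, and plain YAML inputs to one common object list. The names are hypothetical, not Rafay's API, and the actual rendering (e.g. invoking the native tooling for each format) is elided.

```go
package main

import "fmt"

// K8sObject is a rendered Kubernetes object, whatever its source format.
type K8sObject struct {
	Kind, Name string
}

// Normalized is the single internal format all inputs are reduced to;
// downstream scheduling code only ever sees this shape.
type Normalized struct {
	Source  string
	Objects []K8sObject
}

// normalize would invoke the native renderer for the source format
// (elided here) and collect the resulting objects.
func normalize(source string, rendered []K8sObject) (Normalized, error) {
	switch source {
	case "helm", "kustomize", "yaml":
		return Normalized{Source: source, Objects: rendered}, nil
	default:
		return Normalized{}, fmt.Errorf("unsupported format: %s", source)
	}
}

func main() {
	n, err := normalize("helm", []K8sObject{{Kind: "Deployment", Name: "web"}})
	fmt.Println(err == nil, n.Source, len(n.Objects))
}
```

Because every format funnels into the same object list, features like Overrides and anti-entropy checks only need to be written once, against the normalized form.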

We’ve had this implementation deployed in production for some time now, which means all of our customers are already leveraging our differentiated framework for Kubernetes. If you’d like to learn more about it or try it out, please get in touch via email. If you would like to try out our platform, please sign up at https://app.rafay.dev for a free trial account. You can learn more about Rafay on our blog, on LinkedIn, or on Twitter. Also feel free to reach out to me directly via LinkedIn.

Last but certainly not least, I’d like to take this opportunity to thank my colleagues Amudhan Gunasekaran, Andy Zhou, Hruday Gudepu and John Dilley for their invaluable contributions to developing this framework.

Tags:
Kubernetes
