The Kubernetes Current Blog

Cluster Blueprints and Drift Detection

Around three years back, we noticed that many of our customers were struggling to standardize their Kubernetes clusters across the enterprise. Every cluster in their organization was a snowflake, and they were looking for a way to ensure that every cluster had a “baseline set of add-ons”. This prompted us to develop Cluster Blueprints, which has turned out to be one of the most heavily used features on our platform.

In this blog, we describe a “superpower” setting in Cluster Blueprints that customers use heavily to protect their production clusters against unplanned drift.

![Blueprints Icon](img/bp-drift/blueprint.png)

The Drift Problem

Although cluster blueprints solve the “standardization” challenge, it is still possible for users with Cluster Admin privileges to make “accidental” changes to the add-ons associated with a cluster blueprint.

When something like this occurs, the cluster has “drifted” away from its desired state. Unplanned, out-of-band changes can result in significant operational, compliance and security issues. For example, what if such a change altered the configuration of a critical security scanner?
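To make the idea of “drift” concrete, here is a minimal sketch that compares a desired state with the live state and reports every field that differs. It is purely illustrative and is not how the Rafay operator is implemented; the function name `find_drift` and the sample `scanner` add-on are hypothetical.

```python
from typing import Any


def find_drift(desired: dict, live: dict, path: str = "") -> list[str]:
    """Return a description of every field where live differs from desired.

    Only fields present in the desired state are checked, so extra fields
    that exist only in the live state are ignored.
    """
    drifted: list[str] = []
    for key, want in desired.items():
        here = f"{path}.{key}" if path else key
        if key not in live:
            drifted.append(f"{here}: missing")
        elif isinstance(want, dict) and isinstance(live[key], dict):
            # Recurse into nested sections of the configuration.
            drifted.extend(find_drift(want, live[key], here))
        elif live[key] != want:
            drifted.append(f"{here}: expected {want!r}, found {live[key]!r}")
    return drifted


# Hypothetical add-on: the desired state comes from the blueprint,
# the live state is what is actually running on the cluster.
desired = {"scanner": {"image": "scanner:1.4", "replicas": 2, "enabled": True}}
live = {"scanner": {"image": "scanner:1.4", "replicas": 0, "enabled": True}}

print(find_drift(desired, live))
# ['scanner.replicas: expected 2, found 0']
```

In this example, someone has scaled the security scanner down to zero replicas out of band, which is exactly the kind of change that drift detection is meant to surface.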

Drift Detection

Cluster Blueprints in the Rafay Kubernetes Operations platform can be configured to actively monitor for unexpected drift. This monitoring and enforcement is performed by the Rafay Kubernetes Operator deployed on the managed cluster. Customers have two options for how to respond when drift is detected.

Option 1: Notify

Generate an audit event when unplanned drift is detected.

Option 2: Block

Block the unplanned drift and generate an audit event.
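The difference between the two options can be sketched as a small decision function. This is a simplified model under stated assumptions, not the platform's actual API: `DriftAction`, `Decision` and `handle_change` are hypothetical names, and the sketch assumes that in Notify mode the change goes through while an audit event is recorded.

```python
from dataclasses import dataclass
from enum import Enum


class DriftAction(Enum):
    NOTIFY = "notify"  # Option 1: record an audit event only
    BLOCK = "block"    # Option 2: reject the change and record an audit event


@dataclass
class Decision:
    allowed: bool
    audit_event: str  # empty string when no event is generated


def handle_change(action: DriftAction, user: str, resource: str, planned: bool) -> Decision:
    """Decide what happens to a change made to a blueprint-managed resource."""
    if planned:
        # Planned updates (for example, from the approved pipeline) are not drift.
        return Decision(allowed=True, audit_event="")
    event = f"drift: {user} attempted out-of-band change to {resource}"
    if action is DriftAction.BLOCK:
        return Decision(allowed=False, audit_event=f"{event} (blocked)")
    return Decision(allowed=True, audit_event=f"{event} (allowed)")


decision = handle_change(DriftAction.BLOCK, "rogue-admin", "security-scanner", planned=False)
print(decision.allowed)      # False
print(decision.audit_event)  # drift: rogue-admin attempted out-of-band change to security-scanner (blocked)
```

Both modes produce an audit event for unplanned changes; only Block prevents the change from taking effect.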

It is good operational practice to ensure that all updates to production clusters are “planned”, “version-controlled” and “approved”. The diagram below shows an environment where “drift detection-based blocking” is used in conjunction with a modern GitOps-based pipeline that performs the “allowed/planned update”.

``` mermaid
sequenceDiagram
Git Repo->>Git Repo: Pull Request
Git Repo->>Git Repo: Merge
Git Repo->>Controller: Webhook
Controller->>Cluster: Update Blueprint
Cluster->>Cluster: Monitor
Rogue Admin-->>Cluster: Attempts out of band change
Cluster-->>Controller: Audit Event
Cluster-->>Rogue Admin: "X" Attempt Blocked "X"
```

Here’s an example of what the “Rogue Admin” would encounter when they try to delete a “drift protected” resource in the cluster blueprint.

![Blocked Update](img/bp-drift/blockeddrift.png)

Try It Out

If you are interested in trying this out yourself, sign up for a Free Org/Tenant and use our “Getting Started Guide” for Cluster Blueprints and Drift Detection.

[Get Started with Drift Detection](../../learn/quickstart/blueprint/driftdetection/overview.md){ .md-button }

Blog Ideas

Sincere thanks to those who spend time reading our product blogs and provide us with feedback and ideas. Please contact the Rafay Product Team if you would like us to write about specific topics.
