How to Bring Shadow Kubernetes IT into the Light

Shadow IT continues to be a challenge for IT leaders, but perhaps not in the sense that companies have seen in the past. Traditionally, shadow IT occurred within the application stack, where teams adopted IT systems without the approval, or even the knowledge, of the corporate IT department.

DevOps practices have emerged to help address these challenges and to unleash creativity and opportunity for modern software delivery teams. However, access to the cloud has made it easier for autonomous teams to set up their own tool sets. As a result, the shadow IT problem now manifests itself in a new way: in the tooling architecture.

The explosion of container-based applications has made Kubernetes a vital resource for DevOps teams. But its widespread adoption has led to the rapid creation of Kubernetes clusters with little regard for security and costs, either because users don’t understand the complex Kubernetes ecosystem or because they are simply moving too fast to meet deadlines.

This article explores the challenges associated with shadow Kubernetes admins and the benefits of centralizing with the IT department.

What Are Shadow Kubernetes Admins?

A shadow Kubernetes admin is a user who doesn’t wait for their IT department to provision Kubernetes clusters and instead turns to a cloud service provider to spin up Kubernetes clusters at will. However, the freedom and flexibility of the cloud bring some significant business risks that IT leaders cannot ignore.

A shadow Kubernetes admin account left unattended could lead to privileges being granted unexpectedly when new roles are created. This happens because a role binding can keep referring to a role that no longer exists; if a new role is later created with the same name, the stale binding grants that role’s permissions to its subjects. And with every new user, group, role and permission, lack of proper visibility and control increases the risk of human error, mismanagement of user privileges and malicious attacks.
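To make the stale role binding risk concrete, here is a minimal sketch in Python of how a platform team might flag bindings whose referenced role is missing. It works on plain in-memory data rather than a live cluster. The `RoleBinding` record, the `find_dangling_bindings` helper and all sample names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleBinding:
    """Simplified, hypothetical stand-in for a namespaced role binding."""
    name: str
    namespace: str
    role_name: str
    subjects: tuple


def find_dangling_bindings(bindings, existing_roles):
    """Return bindings whose (namespace, role_name) is not an existing role.

    existing_roles is a set of (namespace, role_name) pairs.
    """
    return [
        b for b in bindings
        if (b.namespace, b.role_name) not in existing_roles
    ]


# Sample inventory (assumed data, not taken from a real cluster).
bindings = [
    RoleBinding("dev-read", "team-a", "pod-reader", ("alice",)),
    RoleBinding("old-admin", "team-a", "legacy-admin", ("bob",)),
]
existing_roles = {("team-a", "pod-reader")}

dangling = find_dangling_bindings(bindings, existing_roles)
for b in dangling:
    print(f"{b.namespace}/{b.name} -> missing role '{b.role_name}' "
          f"(subjects: {', '.join(b.subjects)})")
```

In this sketch, if someone later creates a role named `legacy-admin` in `team-a`, the `old-admin` binding would no longer be reported, and `bob` would silently receive whatever the new role allows. That is the case a periodic check like this is meant to catch early.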

Reining in Shadow Kubernetes Admins

To gain control of shadow Kubernetes admins, we first must understand the challenges IT teams face.

Limited Resources

First, it can be very difficult to set up role-based access control (RBAC) for Kubernetes. Natively, it supports a number of different role types and assignment options, but these are hard to manage and track. As a result, Kubernetes admins must set up everything manually, cluster by cluster, and effectively provide the right level of access that each user needs within a cluster.

Since Kubernetes is still a relatively new technology, there is an inherent talent gap in finding staff with the necessary skills and experience to administer and manage these environments properly. Now imagine the complexity of trying to scale and operate a distributed, multicluster, multicloud environment with that level of manual overhead. Not only is the process labor-intensive, but it is also ripe for mistakes.
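One way to picture the alternative to cluster-by-cluster manual setup is to render the same access definitions for every cluster from a single list. The sketch below is illustrative only: the cluster names, users, roles and the `render_for_clusters` helper are assumptions. Only the manifest layout follows the standard `rbac.authorization.k8s.io/v1` RoleBinding shape.

```python
import json

RBAC_API_GROUP = "rbac.authorization.k8s.io"


def role_binding_manifest(user, role, namespace):
    """Build a RoleBinding manifest (as a dict) binding one user to one role."""
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{user}-{role}", "namespace": namespace},
        "subjects": [
            {"kind": "User", "name": user, "apiGroup": RBAC_API_GROUP}
        ],
        "roleRef": {"kind": "Role", "name": role, "apiGroup": RBAC_API_GROUP},
    }


# Hypothetical central access list: (user, role, namespace).
ACCESS_POLICY = [
    ("alice", "pod-reader", "team-a"),
    ("bob", "deployer", "team-b"),
]

# Hypothetical cluster names.
CLUSTERS = ["prod-us-east", "prod-eu-west", "staging"]


def render_for_clusters(clusters, policy):
    """Render the same set of bindings for every cluster."""
    return {
        cluster: [role_binding_manifest(*entry) for entry in policy]
        for cluster in clusters
    }


manifests = render_for_clusters(CLUSTERS, ACCESS_POLICY)
print(json.dumps(manifests["staging"][0], indent=2))
```

Each rendered manifest would still need to be applied and reviewed per cluster. The point of the sketch is only that one definition, kept in one place, replaces a set of hand-edited copies that can drift apart.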

Cloud Flexibility and On-Demand Services

Running container-based applications in production goes well beyond Kubernetes. For example, IT operations teams often require additional services for tracing, logs, storage, security and networking. They may also require different management tools for Kubernetes distribution and compute instances across public clouds, on-premises, hybrid architectures or at the edge.

Integrating these tools and services for a specific Kubernetes cluster requires that each tool or service is configured according to that cluster’s use case. The requirements and budgets for each cluster are likely to vary significantly, meaning that updating or creating a new cluster configuration will differ based on the cluster and the environment. As Kubernetes adoption matures and expands, there will be a direct conflict between admins, who want to lessen the growing complexity of cluster management, and application teams, who seek to tailor Kubernetes infrastructure to meet their specific needs.

What magnifies these challenges even further is the pressure of meeting internal project deadlines, along with the perceived need to use more cloud-based services to get the work done on time and within budget. If jobs are on the line, people will inevitably do whatever they feel they must do to get the work done, even if it means using tools and methods from outside the centralized IT system.

Benefits of Centralizing with Platform Teams

While shadow Kubernetes IT is a growing challenge that IT must control, hindering the productivity of development and operations teams is a nonstarter. Through a centralized platform team model, however, IT can manage and enforce its own standards and policies for Kubernetes environments and prevent shadow admins altogether. IT can allow multiple teams to run applications on a common, shared infrastructure that is managed, secured and governed by the enterprise platform team. Doing so can provide the following benefits:

Standardization

Define and maintain preapproved cluster and application configurations that can be reused across infrastructure and tooling architecture. Standardizing these configurations centrally not only reduces the complexity of manual cluster management but also enables development and operations teams to automate workflows and accelerate delivery.

Repeatability

Create and provide multiple pipelines with predefined workflows and approvals across Kubernetes clusters and application deployments to create consistency from a self-service model throughout the organization. Clearly defined and repeatable processes help to scale Kubernetes environments and optimize resources for project deliverables.

Security

Enable downstream configurable access control to clusters and workloads so developers and operations can connect users or groups appropriately. Integrating preexisting security practices and other centralized systems with cluster and application lifecycle management operations becomes the norm. RBAC techniques and user-based audit logs across clusters and environments help manage authorization and prevent errors that lead to attacks.

Governance

Define and maintain preapproved cluster, backup and recovery policies to be enforced across the rest of the organization. Enforce best practices and reject noncompliant requests for your Kubernetes infrastructure and applications in order to comply with corporate policies and industry regulations such as HIPAA and PCI.
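As a rough illustration of enforcing preapproved policies, the following sketch checks a cluster request against a small policy and reports why it would be rejected. The policy fields, request fields and limits are all hypothetical. A real platform would more likely rely on a dedicated policy engine than on hand-written checks.

```python
# Hypothetical preapproved policy; field names and values are assumptions.
APPROVED_POLICY = {
    "allowed_regions": {"us-east-1", "eu-west-1"},
    "max_nodes": 20,
    "require_backup": True,
}


def check_cluster_request(request, policy=APPROVED_POLICY):
    """Return a list of policy violations; an empty list means approved."""
    violations = []
    if request.get("region") not in policy["allowed_regions"]:
        violations.append(f"region '{request.get('region')}' is not approved")
    if request.get("nodes", 0) > policy["max_nodes"]:
        violations.append(
            f"node count {request.get('nodes')} exceeds limit {policy['max_nodes']}"
        )
    if policy["require_backup"] and not request.get("backup_enabled", False):
        violations.append("backup and recovery must be enabled")
    return violations


compliant = {"region": "us-east-1", "nodes": 5, "backup_enabled": True}
noncompliant = {"region": "ap-south-1", "nodes": 50, "backup_enabled": False}

for name, req in (("compliant", compliant), ("noncompliant", noncompliant)):
    problems = check_cluster_request(req)
    status = "approved" if not problems else "rejected: " + "; ".join(problems)
    print(f"{name}: {status}")
```

Returning the full list of violations, rather than stopping at the first, lets the requesting team fix everything in one pass, which fits the self-service model described earlier.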

Additional Considerations

Enabling a centralized view into all clusters across any environment, including on-premises and public cloud environments, such as AWS, Azure, GCP and OCI, can also help address issues faster. By surfacing issues from one unified source of truth, individuals and teams can collectively pinpoint the cause of any health or performance issues related to clusters. Gaining real-time insights into cluster health and performance helps teams optimize Kubernetes and stay within budget.

Due to the open nature of Kubernetes, it is very easy to make mistakes on clusters that can lead to security risks and deployment issues. Also, security breaches are likely to happen over time. There is a constant need for maintenance, application of patches and upgrades on any type of environment. Recently, Apiiro uncovered a serious vulnerability that gave attackers the opportunity to access sensitive information, such as secrets, passwords and API keys. A centralized platform can help the organization prepare for incidents proactively by maintaining a high level of control, security posture, policy compliance and auditability across the organization.

Kubernetes, while a powerful technology, can bring many operational challenges to enterprise platform teams. With the menace of shadow Kubernetes IT growing, platform teams have a challenging road ahead to deliver a solution that enables developer productivity, centralizes governance and policy management, and reduces operational overhead.

Thankfully, there are SaaS platform solutions available that allow platform teams to focus primarily on delivering modern applications, not on managing and operating Kubernetes.

Rafay’s Kubernetes Operations Platform, for example, works with any infrastructure and provides deep integrations with Kubernetes distributions to accelerate the operational readiness of platform teams to manage, secure and govern Kubernetes at scale within hours. With Rafay, enterprises take advantage of the numerous platform services, such as multicluster management, GitOps, zero-trust access service, Kubernetes policy management, backup and restore, and visibility and monitoring.

This article was originally published in The New Stack.

Author

Kyle Hunter
