Today, one of the most significant benefits of Kubernetes is the ability to quickly deploy and connect applications, whether across namespaces in a cluster or across different cloud environments.
By default, a pod/application running in a Kubernetes cluster has the freedom to communicate with anything inside and outside the cluster with no security rules applied. While this model reduces friction for developers, it can seriously compromise the security posture of your applications in production given the large attack surface.
This is why the NSA and CISA recently published the Kubernetes Hardening Guide, which recommends using network policies to control ingress/egress traffic and enforce segmentation between different applications in your Kubernetes clusters.
What is a network policy?
In Kubernetes environments, network isolation and traffic flow enforcement are typically implemented through network policies. A network policy is essentially a construct that allows you to define the following:
- Rules for what your application can talk to (other entities in the cluster, the internet, etc.) – very similar to firewall rules
- The type of traffic those rules apply to (for example, traffic coming in, i.e., ingress, or traffic going out, i.e., egress)
- The scope – which pods or applications the network policy applies to
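As a concrete sketch, a single Kubernetes NetworkPolicy captures all three elements; the policy name, namespace, and app labels below are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend   # hypothetical policy name
  namespace: default
spec:
  # Scope: which pods this policy applies to
  podSelector:
    matchLabels:
      app: api
  # Traffic direction the rules cover
  policyTypes:
    - Ingress
  # Rules: only pods labeled app=frontend may reach app=api, on TCP 8080
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Once a pod is selected by any policy of a given type, all traffic of that type not explicitly allowed is denied.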
Network policy enforcement enables you to control the communication between your pods and services while also ensuring that your applications are properly isolated from each other, reducing the attack surface. In the case of a breach, this containment keeps the blast radius as small as possible.
The example above showcases the importance of network policies for north-south and east-west protection. Notice that the default namespace is protected from the internet but the test namespace is not. If the test namespace is compromised, an attacker can move laterally to apps even in the default namespace. Creating a network policy for app 2 blocks this east-west traffic, so app 2 is protected, whereas app 1 remains compromised because no east-west network policy rule was created for it.
In addition, network policies can also be used to enforce isolation in shared cluster scenarios. For example, each team or application can have its own namespace and you can enforce that pods and services in one namespace cannot talk to pods and services in other namespaces.
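One common way to express this kind of namespace isolation is a policy that allows traffic only from within the pod's own namespace; the namespace name below is hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-from-other-namespaces   # hypothetical policy name
  namespace: team-a                  # hypothetical team namespace
spec:
  podSelector: {}          # scope: every pod in the team-a namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        # An empty podSelector (with no namespaceSelector) matches
        # only pods in the policy's own namespace, so cross-namespace
        # ingress is denied.
        - podSelector: {}
```

Applied in each team's namespace, this gives every team its own isolated segment of the shared cluster.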
How do I use network policies with my existing network infrastructure?
Most Container Network Interface (CNI) providers today support enforcement of network policies, including several open source options such as Calico and Cilium. If the CNI provider does not support network policies, CNI chaining can be used to enforce them. In this model, the primary CNI provider handles base network connectivity and IP address management, while functions such as network policy enforcement are performed by the chained CNI.
In the example above, traffic coming from the internet is initially handled by the AWS CNI. The Cilium CNI performs only network policy enforcement, once the AWS CNI has taken care of the network connectivity and IP Address Management (IPAM) responsibilities.
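As a rough sketch of this chaining model, Cilium can be installed on top of the AWS VPC CNI via its Helm chart with values along these lines; treat this as illustrative, since exact value names vary across Cilium versions:

```yaml
# values.yaml for the Cilium Helm chart (illustrative; check your
# Cilium version's documentation for the exact settings)
cni:
  chainingMode: aws-cni   # AWS VPC CNI keeps connectivity and IPAM
  exclusive: false        # leave the AWS CNI configuration in place
enableIPv4Masquerade: false  # the AWS CNI already handles pod IP routing
```

With these values, Cilium attaches to pods after the AWS CNI has wired them up, and enforces network policies on that traffic.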
Helping Organizations Automate Network Policy Enforcement With Rafay
Using the Rafay platform and our available recipes for Cilium or Calico, you can set up network policies to enforce security for your workloads with the CNI of your choice. By using network policies with Rafay, you also gain the ability to templatize policies across clusters, while features such as our Cluster Drift Detection ensure that Cilium or Calico are not removed, so network policy enforcement always remains enabled.
Challenges and The Road Ahead
Stay tuned to learn how Rafay will streamline and solve many of the key challenges in operationalizing network policies, including:
- Running Network Policies at scale: Defining and enforcing network policies across a cluster or two can be handled manually, but doing so consistently across a fleet of clusters is a significant operational exercise.
- Controlling Network Access: Today, it is extremely difficult to enable role-based access controls for network policy creation and visibility into network flows. For example, a namespace owner should be able to create network policies and view traffic flows only for their own namespaces.
- Enforcing a standard Zero-Trust operating model: Admins need the ability to enforce a consistent approach to security across different projects (for example, starting with a default deny for all egress/ingress and allowlisting traffic flow patterns as required).
- Enabling Data Retention: For debugging and validation purposes, admins and developers want to be able to go back in time and check how their network configuration and flows looked, but enabling this kind of functionality today is burdensome.
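The default-deny starting point mentioned in the zero-trust item above is typically a namespace-wide policy like the following (namespace name hypothetical), after which specific flows are allowlisted with additional policies:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all   # hypothetical policy name
  namespace: my-app        # hypothetical namespace
spec:
  podSelector: {}          # scope: all pods in the namespace
  policyTypes:             # listing both types with no ingress/egress
    - Ingress              # rules denies all inbound and outbound
    - Egress               # traffic by default
```

Rolling out a baseline like this consistently across a fleet of clusters is exactly the kind of task that benefits from centralized automation.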