Dynamically Provisioning Persistent Volumes with Kubernetes

Storage in the Kubernetes and container world is handled differently than it is with virtual machines (VMs) or other types of infrastructure.

Containerized applications typically scale by running multiple instances of containers in parallel. As a result, you have many more containers running at one time than you would VMs, and the lifespan of any given container instance is typically much shorter—minutes or hours. A running VM, by comparison, might persist for weeks or months. While an application running in a VM often stores data in the VM, that doesn’t make sense given the ephemeral nature of containers.

This blog explains how to address the persistent storage needs of stateful containerized applications using Kubernetes persistent volumes (PVs) and Kubernetes persistent volume claims (PVCs). It also describes how PVs can be provisioned either statically or dynamically.

What are Persistent Volumes (PVs)?

Stateful applications require data to persist beyond the lifespan of a container instance. Applications often store information for later processing or as a record changes or transactions occur. PVs allow storage volumes to be separate from container instances. A PV is an allocated storage volume within a cluster that has either been provisioned by an administrator or dynamically provisioned using Storage Classes (more on this later).

Kubernetes supports many types of volumes including both file and block storage. The various storage vendors and cloud providers have created Container Storage Interface (CSI) drivers that enable Kubernetes to take advantage of the features of their underlying storage offerings. Deciding what type(s) of storage your cluster will consume is an important management task taking into consideration static/dynamic provisioning, quality of service (QoS), can easily scale out, and more.

A PV is a cluster resource similar to other resources in Kubernetes such as a node, memory (RAM), etc. As with any cluster resources, it can be created via a configuration file written in YAML.
apiVersion: v1 kind: PersistentVolume metadata: name: 1-pv spec: capacity: storage: 5Gi volumeMode: Filesystem accessModes: - ReadWriteOnce storageClassName: slow claimRef: name: 1-pvc namespace: test hostPath: path: /var/lib/pv
Note that the definition specifies a variety of aspects about the PV such as its name, capacity, access modes, etc.

It’s helpful to think about how a simple three-node Kubernetes cluster operates to understand what is expected of a PV. Kubernetes containers run in Pods. During normal operations, Pod X running on Node A might stop and be restarted on Node C or Node B, so any PV it’s associated with (PV-X in the diagram below) will need to be accessible from every node where the Pod can run. Often that means it will need to be accessible on all nodes in the cluster. And, of course, you need to have methods to create PVs and a way for application containers running in pods to consume them.

Static PV Provisioning and Persistent Volume Claims

With static provisioning, a Kubernetes admin has to know what persistent volumes developers will need in advance and create those volumes on underlying storage by creating YAML files as shown above. Once a PV is created, it is accessible via the Kubernetes API and available for consumption.

A persistent volume claim (PVC) is a request for a PV. Claims describe the storage required including a specific name, size, and access modes. A PVC may specify some or all of the parameters of a suitable PV. If Kubernetes finds an available PV that matches the request, the PVC is bound to the PV. The YAML for a PVC might look like the following:

apiVersion: v1 kind: PersistentVolumeClaim metadata: name: wp-pv-claim labels: app: wordpress spec: accessModes: - ReadWriteOnce resources: requests: storage: 20Gi

Once the PVC is established, you can then use that claim in your Deployment or Pod specification:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
spec:
  selector:
    matchLabels:
      app:wordpress
      tier: mysql
   strategy:
     type: Recreate
   template:
     metadata:
       labels:
         app:wordpress
         tier: mysql
   spec:
     containers:
     - image: mysql:5.6
       name: mysql
       env:
       - name: MYSQL_ROOT_PASSWORD
         valueFrom:
           secretKeyRef:
             name: mysql-pass
             key: password
      ports:
      - containerPort: 3306
        name: mysql
      volumeMounts:
      - name: mysql-persistent-storage
        mountPath:/var/lib/mysql
     volumes:
    - name: mysql-persistent-storage
      persistentVolumeClaim:
        claimName: mysql-pv-claim

In summary, static PV provisioning requires these steps:

Kubernetes Admin creates a PV.
Developer creates a PVC to request storage and claim the PV.
Developer configures a Pod to use the PV based on the satisfied claim.

Rafay has a tutorial that provides hands-on experience to explore these concepts.

Dynamic Provisioning

Static provisioning works okay to a point, but it either requires a lot of guesswork on the part of the administrator (in order to create PVs that applications may need) or a lot of back and forth between developers and admins. As your Kubernetes environment expands, this can become a bottleneck.

Dynamic provisioning solves this problem. Instead of the Kubernetes admin creating specific PVs, the admin defines Storage Classes. Each StorageClass has a specific storage backend from which PVs can be provisioned automatically to meet an application’s requirements. For example, you might have separate storage classes for block and file storage. Classes can be as fine grained as you want them to be. You could have slow, medium, and fast block storage, or classes that are always backed up or never backed up.

To create a StorageClass, the administrator must specify a provisioner and parameters appropriate to that provisioner. For example, Kubernetes has an internal provisioner for AWS EBS (block) storage: kubernetes.io/aws-ebs.

Kubernetes provides a variety of internal provisioners which are described in the Kubernetes documentation here. External providers are also available or can be created.

With dynamic provisioning, a developer can use a PVC to request a specific storage type and have a new PV provisioned automatically. The PVC must request a StorageClass that has already been created and configured on the target cluster by an admin for dynamic provisioning to work.

Dynamic Provisioning Benefits

There are several benefits from using dynamic provisioning, especially as your operations grow:

Simplified workflows: Applications provision storage as they need it without constant back and forth between users and admins.
Reduced cost: The overhead associated with pre-provisioning volumes and managing them is eliminated, and Admins don’t have to manually support developers’ storage needs.
Reduced storage spending: Dynamic provisioning automatically allocates and deallocates PVs in response to PVCs, reducing waste from storage that is allocated but never used.
Optimized performance: PVCs are fulfilled dynamically based on StorageClass attributes, so the resulting volume more closely matches the need.

Persistent Volumes and Backup

Kubernetes environments require backups that keep up with the state of your cluster and applications to protect against disaster. If your backups are designed to back up only the PVs you know about, sooner or later you may miss some. Kubernetes backups should inspect the state of your clusters to find all PVs. Policy-based solutions help ensure that you’re backing up all of your PVs at the right intervals to deliver the desired service level.

Persistent Volumes at Rafay

With Rafay, you can use the K8s distribution of your choice without restrictions. The various services that make up the Rafay Kubernetes Operations Platform (KOP) provide easy access to all the storage capabilities of your chosen Kubernetes distribution(s).

Rafay’s Backup & Restore Service discovers dynamically provisioned PVs and makes it simple to backup all your PVs with the desired service level. The Backup & Restore Service lets you avoid costly backup mistakes and eliminates backup headaches. It supports critical backup/restore, disaster recovery, and migration use cases for Kubernetes clusters and applications running on-premises, in the cloud, and at the edge.

Rafay implements Kubernetes backup best practices in a solution that is easy-to-use, scalable, and secure, making it simple to create and manage backup policies, backup locations, and restore policies to achieve greater operational consistency and eliminate common Kubernetes backup risks.

The Rafay KOP delivers the Kubernetes management capabilities you need to ensure the success of your entire environment, helping you rationalize and standardize management across K8s clusters and applications.

Ready to find out why so many enterprises and platform teams have partnered with Rafay to streamline Kubernetes operations? Sign up for a free trial. Follow our quickstart guide to see what Rafay’s Backup & Restore Service and other Rafay services can do.

Author

Cliff Malmborg

View all posts

Streamline AI/ML Adoption: Expert Strategies to Conquer IT Hurdles and Accelerate Growth.