Skip to content

tomgeorge/backup-restore-operator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

59 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Backup and Restore Kubernetes Operator

An operator to easily back up the data of a stateful application in Kubernetes. Uses CSI for kubernetes-native backup management.

Contents

  1. Prerequesites
  2. Installing the Operator
  3. Creating a Backup

Prerequisites

  • Kubernetes 1.12+, with the VolumeSnapshotDataSource=true feature gate set on the API server. See the documentation to learn how to enable a feature gate on a kubernetes component.
  • A CSI storage driver intsalled in the cluster. See the list of drivers for a list of options.

Installing the Operator

kubectl create namespace backup-restore-operator

kubectl apply -f deploy/

kubectl apply -f deploy/crds/

This will create:

  • A backup-restore-operator ServiceAccount in the backup-restore-operator namespace
  • ClusterRole and CluserRoleBinding for the service account to perform necessary API operations
  • A backup-restore-operator Deployment that runs the controllers and watches for VolumeBackup requests.
  • VolumeBackup and VolumeBackupProvider CRDs.

Verify that the operator is running:

λ ~/go/src/github.com/tomgeorge/backup-restore-operator/ master* kubectl get pods -n backup-restore-operator | grep backup
backup-restore-operator-7c7c89d976-msx5z      1/1     Running   0          25s

Also check the logs to make sure you don't see any errors.

Creating a Backup

  • Create a deployment with a pre-hook and post-hook to quiesce the application. The following is an example mysql deployment with a volume and hooks to freeze/unfreeze the application:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: mysql
      labels:
        app: mysql-persistent
        template: mysql-persistent-template
    spec:
      replicas: 1
      selector:
        matchLabels:
          name: mysql
      template:
        metadata:
          labels:
            name: mysql
          annotations:
            backups.example.com.pre-hook: "mysql -h 127.0.0.1 --user=root --password=$MYSQL_ROOT_PASSWORD --database=$MYSQL_DATABASE -e 'flush tables with read lock;'"
            backups.example.com.post-hook: "mysql -h 127.0.0.1 --user=root --password=$MYSQL_ROOT_PASSWORD --database=$MYSQL_DATABASE -e 'unlock tables;'"
        spec:
          containers:
          - env:
            - name: MYSQL_USER
              valueFrom:
                secretKeyRef:
                  key: database-user
                  name: mysql
            - name: MYSQL_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: database-password
                  name: mysql
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: database-root-password
                  name: mysql
            - name: MYSQL_DATABASE
              valueFrom:
                secretKeyRef:
                  key: database-name
                  name: mysql
            image: mysql
            imagePullPolicy: IfNotPresent
            livenessProbe:
              failureThreshold: 3
              initialDelaySeconds: 30
              periodSeconds: 10
              successThreshold: 1
              tcpSocket:
                port: 3306
              timeoutSeconds: 1
            name: mysql
            ports:
            - containerPort: 3306
              protocol: TCP
            readinessProbe:
              exec:
                command:
                - /bin/sh
                - -i
                - -c
                - MYSQL_PWD="$MYSQL_PASSWORD" mysql -h 127.0.0.1 -u $MYSQL_USER -D $MYSQL_DATABASE
                  -e 'SELECT 1'
              failureThreshold: 3
              initialDelaySeconds: 5
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 1
            resources:
              limits:
                memory: 512Mi
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /mysql-data
              name: mysql-data
              subPath: mysql-data
          dnsPolicy: ClusterFirst
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
          volumes:
          - name: mysql-data
            persistentVolumeClaim:
              claimName: mysql-claim
    • Create a VolumeBackup object with a reference to the application:
    apiVersion: backups.example.com/v1alpha1
    kind: VolumeBackup
    metadata:
      name: example-volumebackup
    spec:
      applicationName: mysql
      containerName: mysql
      volumeName: mysql-claim

You should now see a VolumeBackup and it's accompanying VolumeSnapshot and VolumeSnapshotContent objects:

kubectl get volumebackups,volumesnapshot,volumesnapshotcontents -o yaml

Backup Flow

Given a request for backup of a pvc: Find all the pods that use that pvc Call freeze Use snapshot.external-storage.k8s.io api to create snapshot Watch VolumeSnapshot object until it has a timestamp Unfreeze the pods Watch the VolumeSnapshot to wait for the upload to complete (wait for readyToUse) Create a VolumeClaim that has the source set to the VolumeSnapshot object Create a Pod with the VolumeClaim mounted inside of it Sync the mount point with an object store (rsync, restic, look for tools) Need two fields for this: the backup provider and the subdirectory

Restore Flow

Given a Restore object that references a Deployment, a PersistentVolumeClaim, and a VolumeSnapshotData object: Create a PVC with the snapshot-promoter storage class Update the deployment with the newly created PVC Application restarts Optional: Delete the old PVC

Subsequent deployments of an application that has been restored could remove the previously restored data

Pairing Notes

  • Change ApplicationName to just a pod selector? That way you don't have to care what kind of higher-level application you are backing up
  • Document that we are performing the backup on pod 0 of the returned list of pods
  • Need to check if we have frozen/unfrozen the Pod before we perform the freeze. Track this in the Status of the VolumeBackup
  • PodExecutor assumes the zeroth container, should pass in the container we want
  • Have to update the status of the VolumeBackup saying that it has been frozen, and return
  • Before we issue a backup, we have to check whether or not a backup has been performed (if the snapshot exists)
  • Rename issueBackup to issueSnapshot
  • Add a new field to the VolumeBackup specifying the VolumeSnapshotClass

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages