Automated Solr Backups in Fusion 5
You can schedule backups of Solr collections, store the backups for a configurable period of time, and restore these backups into a specified Fusion cluster when needed.
This guide makes the following assumptions:
-
You are using Google Kubernetes Engine (GKE).
-
You used the setup scripts in the fusion-cloud-native repository to install Fusion.
Implementation
Backups are taken using the Solr collection BACKUP command. This requires that every Solr node has access to a shared volume, or in Kubernetes a ReadWriteMany volume. Most cloud providers offer a simple way to create a shared filestore and expose it as a PersistentVolumeClaim that can be mounted into the Solr pods. The setup_f5_.sh scripts in the fusion-cloud-native repository include an option to provision these.
A Kubernetes CronJob invokes the backup action of the script according to the configured backup schedule. The backups are saved to a configurable directory with an automatically generated name: <collection_name>-<timestamp_in_some_format>.
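As a rough illustration of what a single scheduled run might do, the sketch below builds a backup name of the form <collection_name>-<timestamp> and the matching Solr Collections API BACKUP request. The SOLR_URL and BACKUP_DIR defaults, the function names, and the timestamp format are assumptions made for illustration; the backup runner itself may do this differently.

```shell
# Sketch only: SOLR_URL, BACKUP_DIR, and the timestamp format are assumptions.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
BACKUP_DIR="${BACKUP_DIR:-/mnt/solr-backups}"

# Build a name such as "products-20240101T030000Z" (UTC; hypothetical format).
# $1 = collection name
backup_name() {
  printf '%s-%s\n' "$1" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Build the Collections API URL that backs up one collection.
# $1 = collection name, $2 = backup name
backup_url() {
  printf '%s/admin/collections?action=BACKUP&collection=%s&name=%s&location=%s\n' \
    "${SOLR_URL}" "$1" "$2" "${BACKUP_DIR}"
}

# Usage (needs a reachable Solr node, so it is left commented out):
# name="$(backup_name products)"
# curl -fsS "$(backup_url products "${name}")"
```

The location parameter must point at a path that every Solr node can see, which is why the shared volume is required.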
A separate CronJob is responsible for cleanup and retention of backups, so cleanup can be disabled independently of the backups themselves. The retention periods allow backups to be kept less frequently as they become older. For example, a cluster that backs up a collection every 3 hours could specify a retention policy that:
-
keeps all backups for a single day.
-
keeps a single backup a day for a week.
-
keeps a single backup a week for a month.
-
keeps a single backup a month for 6 months.
-
deletes all backups that are older than this time.
All times are configurable as part of the configmap for this service.
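The example policy above can be read as a set of age thresholds. The sketch below maps a backup's age in hours to the rule that would apply to it. The function name and the output labels are hypothetical, a month is assumed to be 30 days, and the real cleanup job reads its periods from the configmap rather than hard-coding them.

```shell
# Sketch only: thresholds mirror the example policy; a month is taken as 30 days.
# Prints which retention rule applies to a backup that is $1 hours old.
retention_tier() {
  age_hours="$1"
  if [ "${age_hours}" -lt 24 ]; then
    echo "keep-all"            # first day: keep every backup
  elif [ "${age_hours}" -lt $((7 * 24)) ]; then
    echo "keep-one-per-day"    # up to a week old
  elif [ "${age_hours}" -lt $((30 * 24)) ]; then
    echo "keep-one-per-week"   # up to a month old
  elif [ "${age_hours}" -lt $((180 * 24)) ]; then
    echo "keep-one-per-month"  # up to 6 months old
  else
    echo "delete"              # older than the longest retention period
  fi
}

# Usage:
# retention_tier 5      # a backup taken 5 hours ago
# retention_tier 200    # a backup a little over 8 days old
```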
The process for restoring a collection is a manual step involving kubectl run. This action invokes the Solr RESTORE action, pointing to the collection and the name of the backup that should be restored.
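For orientation, the underlying Solr request looks roughly like the sketch below. It does not reproduce the kubectl run invocation, only the Collections API RESTORE call that such a step would eventually make. The SOLR_URL and BACKUP_DIR defaults and the function name are assumptions for illustration.

```shell
# Sketch only: SOLR_URL and BACKUP_DIR are assumed values, not Fusion defaults.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
BACKUP_DIR="${BACKUP_DIR:-/mnt/solr-backups}"

# Build the Collections API URL that restores a named backup into a collection.
# $1 = backup name, $2 = target collection name
restore_url() {
  printf '%s/admin/collections?action=RESTORE&name=%s&collection=%s&location=%s\n' \
    "${SOLR_URL}" "$1" "$2" "${BACKUP_DIR}"
}

# Usage (needs a reachable Solr node, so it is left commented out):
# curl -fsS "$(restore_url products-20240101T030000Z products)"
```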
These instructions are for GKE only. For other platforms, backup and restoration involve copying the collection to the cloud and using Parallel Bulk Loader.
Installation
The solr-backup-runner requires that a ReadWriteMany volume is mounted onto all the Solr pods and the backup-runner pods so that all pods back up to a consistent filesystem.
GKE example
The easiest way to install in GKE is by using a GCP Filestore volume as the ReadWriteMany volume.
-
Create the filestore.
gcloud --project "${GCLOUD_PROJECT}" filestore instances create "${NFS_NAME}" --tier=STANDARD --file-share=name="solrbackups,capacity=${SOLR_BACKUP_NFS_GB}GB" --zone="${GCLOUD_ZONE}" --network=name="${network_name}"
-
Fetch the IP address of the filestore once it has been created.
NFS_IP="$(gcloud filestore instances describe "${NFS_NAME}" --project="${GCLOUD_PROJECT}" --zone="${GCLOUD_ZONE}" --format="value(networks.ipAddresses[0])")"
-
Create a Persistent Volume in Kubernetes that is backed by this filestore.
cat <<EOF | kubectl -n "${NAMESPACE}" apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${NAMESPACE}-solr-backups
  annotations:
    pv.beta.kubernetes.io/gid: "8983"
spec:
  capacity:
    storage: ${SOLR_BACKUP_NFS_GB}G
  accessModes:
    - ReadWriteMany
  nfs:
    path: /solrbackups
    server: ${NFS_IP}
EOF
-
Create a Persistent Volume Claim in the namespace that Solr is running in.
cat <<EOF | kubectl -n "${NAMESPACE}" apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fusion-solr-backup-claim
spec:
  volumeName: ${NAMESPACE}-solr-backups
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: ${SOLR_BACKUP_NFS_GB}G
EOF
-
Add the following values to your existing (or a new) helm values file.
solr-backup-runner:
  enabled: true
  sharedPersistentVolumeName: fusion-solr-backup-claim
solr:
  additionalInitContainers:
    - name: chown-backup-directory
      securityContext:
        runAsUser: 0
      image: busybox:latest
      command: ['/bin/sh', '-c', "owner=$(stat -c '%u' /mnt/solr-backups); if [ ! \"${owner}\" = \"8983\" ]; then chown -R 8983:8983 /mnt/solr-backups; fi "]
      volumeMounts:
        - mountPath: /mnt/solr-backups
          name: solr-backups
  additionalVolumes:
    - name: solr-backups
      persistentVolumeClaim:
        claimName: fusion-solr-backup-claim
  additionalVolumeMounts:
    - name: solr-backups
      mountPath: "/mnt/solr-backups"
-
Upgrade the release. Solr backups are now enabled.
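After the upgrade, a quick check along the following lines may help confirm that the claim is bound and that the CronJobs exist. This is a sketch that assumes kubectl is already configured for the cluster and that NAMESPACE is set as in the steps above. The function names are hypothetical.

```shell
# Sketch only: assumes kubectl is configured for the cluster and NAMESPACE is set.

# Print the phase of the backup claim; "Bound" means it is attached to the volume.
backup_claim_phase() {
  kubectl -n "${NAMESPACE}" get pvc fusion-solr-backup-claim \
    -o jsonpath='{.status.phase}'
}

# List the CronJobs in the namespace; the backup and cleanup jobs should appear here.
list_cronjobs() {
  kubectl -n "${NAMESPACE}" get cronjobs
}

# Usage (needs cluster access, so it is left commented out):
# [ "$(backup_claim_phase)" = "Bound" ] && list_cronjobs
```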