Version: 1.28.6

Kubernetes Fury Disaster Recovery

Overview

The Kubernetes Fury DR module is based on Velero and the Velero Node Agent.

Module's repository: https://github.com/sighupio/fury-kubernetes-dr

Velero allows you to:

  • back up your cluster
  • restore your cluster in case of problems
  • migrate cluster resources to other clusters
  • replicate your production environment to development and testing environments

Together with Velero, the Velero Node Agent allows you to:

  • back up Kubernetes volumes
  • restore Kubernetes volumes

With the snapshot-controller, you can also enable support for CSI Snapshot Data Movement, which allows you to:

  • back up the volume data to a pre-defined backup storage
  • have consistent backups of your data

The module also contains Velero plugins that integrate Velero natively with different cloud providers and use the cloud provider's volumes as the storage backend.

Packages

Kubernetes Fury DR provides the following packages:

Package | Description
--- | ---
velero | Back up and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes.

The velero package contains the following additional components:

Component | Description
--- | ---
velero-node-agent | Incremental backup and restore of Kubernetes volumes.
velero-schedules | Common schedules for backups.

Info: All the components are deployed in the kube-system namespace of the cluster.

Introduction: Disaster Recovery in Kubernetes

Disaster recovery (DR) is a critical process for ensuring business continuity in the face of unexpected events like hardware failures, cyberattacks, or natural disasters. A robust DR strategy involves preparing for data loss and system downtime by implementing tools and processes to restore normal operations quickly and reliably. Effective DR plans encompass backup and recovery, failover mechanisms, and testing to ensure preparedness.

In Kubernetes, disaster recovery presents unique challenges due to the dynamic, distributed, and ephemeral nature of containerized environments. A Kubernetes DR strategy should include:

  • Cluster State Backup: Saving resource definitions like deployments, services, and configurations.
  • Persistent Volume Backups: Ensuring that application data stored in persistent volumes is preserved.
  • Cross-Cluster Recovery: Enabling the restoration of workloads to a different cluster or cloud region in case of a total failure.

A comprehensive DR solution for Kubernetes must address both application data and the cluster's configuration state, ensuring seamless recovery.

KFD: DR Module

KFD provides Velero to perform backups and store them safely.

KFD will automatically install the appropriate Velero components based on the Provider of the cluster and the configuration you provide. More specifically:

  • OnPremises: KFD will install Velero, Velero Node Agent to perform backups for PVCs, and an optional MinIO instance to store the backups.
  • EKSCluster: KFD will install Velero and the velero-aws plugin to handle backups natively in AWS.
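The selection rule above can be sketched as a small lookup. This is only an illustration of the rule as described here, not KFD's actual implementation: the function name `velero_components`, the `install_minio` flag, and the component identifier strings are hypothetical.

```python
from typing import List


def velero_components(provider_kind: str, install_minio: bool = False) -> List[str]:
    """Return the DR components described above for a provider kind.

    Hypothetical helper: the names are illustrative, not KFD identifiers.
    """
    if provider_kind == "OnPremises":
        # Velero plus the Node Agent for PVC backups.
        components = ["velero", "velero-node-agent"]
        if install_minio:
            # MinIO is optional and only used to store the backups.
            components.append("minio")
        return components
    if provider_kind == "EKSCluster":
        # Velero plus the AWS plugin for native backups in AWS.
        return ["velero", "velero-aws"]
    raise ValueError(f"unsupported provider kind: {provider_kind}")


print(velero_components("OnPremises", install_minio=True))
print(velero_components("EKSCluster"))
```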

Regardless of the provider, KFD creates two distinct types of backup:

  1. Manifests: backs up the Kubernetes resource manifests, that is, the cluster state stored in etcd.
  2. Full: performs a full backup that also includes the PersistentVolumes currently in use inside the cluster.

You can specify a retention period and a cron-like schedule for each of them. See the Provider's schema for more details.
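To make the two backup types concrete, here is a minimal sketch that builds two Velero `Schedule` manifests as Python dictionaries. The field names (`spec.schedule`, `spec.template.ttl`, `spec.template.snapshotVolumes`, `spec.template.defaultVolumesToFsBackup`) come from the Velero `velero.io/v1` API. The schedule names, cron expressions, and retention values are assumptions chosen for illustration and are not necessarily what KFD deploys.

```python
import json
from typing import Any, Dict


def velero_schedule(name: str, cron: str, ttl: str, include_volumes: bool) -> Dict[str, Any]:
    """Build a Velero Schedule manifest (hypothetical helper)."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # The module deploys its components in kube-system.
        "metadata": {"name": name, "namespace": "kube-system"},
        "spec": {
            "schedule": cron,  # cron-like schedule
            "template": {
                "ttl": ttl,  # retention period, as a Go duration string
                # Assumption: volume data is either skipped entirely or
                # backed up, via snapshots or file-system backup.
                "snapshotVolumes": include_volumes,
                "defaultVolumesToFsBackup": include_volumes,
            },
        },
    }


# Hypothetical names, schedules, and retention periods.
manifests = velero_schedule("manifests", "*/15 * * * *", "720h0m0s", include_volumes=False)
full = velero_schedule("full", "0 1 * * *", "720h0m0s", include_volumes=True)

print(json.dumps(manifests, indent=2))
print(json.dumps(full, indent=2))
```

The output can be saved to files and applied with `kubectl apply -f`, since `kubectl` accepts JSON manifests as well as YAML.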

Requirements

This module requires the KFD Monitoring module to be installed in the cluster, because it provides the ServiceMonitor CRD that Velero uses to install a ServiceMonitor definition.

Velero

Velero (formerly Heptio Ark) gives you a tool to back up and restore your Kubernetes cluster resources and persistent volumes. You can run Velero with a cloud provider or on-premises. Velero lets you:

  • Take backups of your cluster and restore in case of loss.
  • Migrate cluster resources to other clusters.
  • Replicate your production cluster to development and testing clusters.

Velero consists of:

  • A server that runs on your cluster
  • A command-line client that runs locally
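As an illustration of how the local client talks to the server, the sketch below assembles two common `velero` CLI invocations as argument lists. It only builds and prints the commands, so it runs without Velero installed. The helper names, backup name, and namespace list are hypothetical. The `--namespace kube-system` flag is included on the assumption that the server runs in `kube-system`, as this module deploys it, instead of the client's default `velero` namespace.

```python
import shlex
from typing import List

# Assumption: the Velero server was deployed in kube-system by this module.
VELERO_NAMESPACE = "kube-system"


def backup_command(name: str, namespaces: List[str], ttl: str) -> List[str]:
    """Build a `velero backup create` invocation (hypothetical helper)."""
    return [
        "velero", "backup", "create", name,
        "--include-namespaces", ",".join(namespaces),
        "--ttl", ttl,
        "--namespace", VELERO_NAMESPACE,
    ]


def restore_command(backup_name: str) -> List[str]:
    """Build a `velero restore create` invocation from an existing backup."""
    return [
        "velero", "restore", "create",
        "--from-backup", backup_name,
        "--namespace", VELERO_NAMESPACE,
    ]


# Print the commands; pass the lists to subprocess.run(...) to execute them.
print(shlex.join(backup_command("demo-backup", ["default"], "72h0m0s")))
print(shlex.join(restore_command("demo-backup")))
```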
