Using the sctool backup command, you can schedule a backup of a managed cluster. Backups are scheduled in the same manner as repairs: you can start, stop, resume, and track task progress on demand. The following backup storage engines are supported:
S3-compatible API storage providers, such as Ceph or MinIO
Google Cloud Storage
Backup features include:
Glob patterns to select keyspaces or tables to back up
Deduplication of SSTables
Retention of old data
Throttling of upload speed
Configurable upload destination per datacenter
Pause and resume
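As an illustration of how several of these features map onto a single command, the following minimal sketch assembles an sctool backup command line without running it. The flag names (--cluster, --location, --keyspace, --retention, --rate-limit) are given as recalled from the sctool documentation and should be checked against `sctool backup --help` for your version. The cluster name, bucket name, and keyspace patterns are hypothetical.

```python
import shlex


def build_backup_command(cluster, locations, keyspace_globs=(),
                         retention=None, rate_limit_mib=None):
    """Assemble an `sctool backup` argument list. Nothing is executed here.

    Flag names are assumptions to verify with `sctool backup --help`.
    """
    argv = ["sctool", "backup", "--cluster", cluster]
    # One or more upload destinations, comma-separated.
    argv += ["--location", ",".join(locations)]
    if keyspace_globs:
        # Glob patterns selecting keyspaces or tables; "!" excludes a match.
        argv += ["--keyspace", ",".join(keyspace_globs)]
    if retention is not None:
        # Number of backups to keep before old ones are purged.
        argv += ["--retention", str(retention)]
    if rate_limit_mib is not None:
        # Upload throttling limit.
        argv += ["--rate-limit", str(rate_limit_mib)]
    return argv


# Hypothetical cluster, bucket, and keyspace names.
cmd = build_backup_command(
    "prod-cluster",
    ["s3:example-backups"],
    keyspace_globs=["shop.*", "!shop.tmp_*"],
    retention=7,
    rate_limit_mib=50,
)
# shlex.join quotes the glob characters so the line is safe to paste in a shell.
print(shlex.join(cmd))
```

Building the argument list first and quoting it with shlex.join keeps glob characters such as `*` and `!` from being expanded by the shell.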
The backup procedure consists of multiple steps executed sequentially.
Snapshot - Take a snapshot of data on each node (according to backup configuration settings).
(Optional) Schema - Upload the schema in CQL text format to the backup storage destination. This requires that you added the cluster with a CQL username and password. If you did not, you can update the cluster using sctool at any time.
Upload - Upload the snapshot to the backup storage destination.
Manifest - Upload the manifest file containing metadata about the backup.
Purge - If the retention threshold has been reached, remove the oldest backup from the storage location.
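The ordering of the steps above, and the effect of the purge step, can be sketched as follows. This is a simplified model, not Scylla Manager's implementation: the stage names are just labels taken from the list, the function names are hypothetical, and the snapshot tags are made-up timestamp strings assumed to sort chronologically.

```python
def plan_stages(has_cql_credentials):
    """Return the backup stages in execution order.

    The schema stage is included only when the cluster was added with
    CQL credentials, mirroring the "(Optional)" note above.
    """
    stages = ["snapshot"]
    if has_cql_credentials:
        stages.append("schema")
    stages += ["upload", "manifest", "purge"]
    return stages


def backups_to_purge(snapshot_tags, retention):
    """Return the tags to remove, oldest first, keeping the newest `retention`.

    Assumes tags sort chronologically as plain strings (hypothetical format).
    """
    if retention < 1:
        raise ValueError("retention must be at least 1")
    ordered = sorted(snapshot_tags)
    excess = len(ordered) - retention
    return ordered[:max(excess, 0)]


# Hypothetical tags: four backups exist and the retention is three.
tags = ["tag_20240103", "tag_20240101", "tag_20240104", "tag_20240102"]
print(plan_stages(has_cql_credentials=False))
print(backups_to_purge(tags, retention=3))
```

With four backups and a retention of three, only the oldest tag is selected for removal. When the number of backups is at or below the retention value, nothing is purged.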
You need to create a backup location, for example an S3 bucket. We recommend creating it in the same region as the Scylla nodes to minimize cross-region data transfer costs. In multi-DC deployments, create one bucket per datacenter, each located in that datacenter's region.
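For a multi-DC deployment, the per-datacenter buckets are typically passed to sctool as a list of locations. The sketch below builds such a list, assuming the `<dc>:<provider>:<bucket>` location format as recalled from the sctool documentation (verify it for your version). The datacenter and bucket names are hypothetical.

```python
def dc_locations(buckets_by_dc, provider="s3"):
    """Build one location string per datacenter.

    Assumes the `<dc>:<provider>:<bucket>` format; names are hypothetical.
    """
    return [
        f"{dc}:{provider}:{bucket}"
        for dc, bucket in sorted(buckets_by_dc.items())
    ]


# Each bucket is assumed to live in the same region as its datacenter.
locations = dc_locations({
    "dc-us-east": "backups-us-east-1",
    "dc-eu-west": "backups-eu-west-1",
})
# A comma-separated value like this could be given as the backup location.
print(",".join(locations))
```

Sorting by datacenter name keeps the generated value stable between runs, which makes it easier to compare scheduled task definitions.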
Details may differ depending on the storage engine. Please consult: