cbbackupmgr restore

cbbackupmgr-restore - Restores data from the backup archive to a Couchbase cluster.

Synopsis

cbbackupmgr restore --archive <archive_dir> --repo <repo_name> --host <hostname> --username <username> --password <password> [--start <backup>] [--end <backup>] [--exclude-buckets <bucket_list>] [--include-buckets <bucket_list>] [--disable-bucket-config] [--disable-views] [--disable-gsi-indexes] [--disable-ft-indexes] [--disable-data] [--force-updates] [--threads <num>] [--no-progress-bar]

Description

Restores data from the backup archive to a target Couchbase cluster.

The restore command is capable of restoring a single backup or a range of backups. When restoring a single backup, all data from that backup is restored. If a range of backups is restored, then cbbackupmgr takes into account any failovers that may have occurred in between the time that the backups were originally taken. If a failover did occur in between the backups, and the backup archive contains data that no longer exists in the cluster, then the data that no longer exists is skipped during the restore. If no failovers occurred in between backups then restoring a range of backups restores all data from each backup. If all data must be restored regardless of whether a failover occurred in between the original backups, then data should be restored one backup at a time.

The restore command is guaranteed to work during rebalances and failovers. If a rebalance is taking place, cbbackupmgr tracks the movement of vBuckets around a Couchbase cluster and ensures that data is restored to the appropriate node. If a failover occurs during the restore then the client will wait 180 seconds for the failed node to be removed from the cluster. If the failed node is not removed in 180 seconds then the restore will fail, but if the failed node is removed before the timeout then data will continue to be restored.

Options

Below is a list of required and optional parameters for the restore command.

Required

--archive <archive_dir>

  • The directory containing the backup repository to restore data from.

--repo <repo_name>

  • The name of the backup repository to restore data from.

--host <hostname>

  • The hostname of one of the nodes in the cluster to restore data to. See the Host Formats section below for hostname specification details.

--username <username>

  • The username for cluster authentication.

--password <password>

  • The password for cluster authentication.

Optional

--start <backup>

  • The name of the first backup in the backup repository to restore. If not specified this value will default to the oldest backup in the backup repository.

--end <backup>

  • The name of the last backup in the backup repository to restore. If not specified this value will default to the most recent backup in the backup repository.

--exclude-buckets <bucket_list>

  • Restores all buckets in a backup that are not specified in <bucket_list>. This flag cannot be specified at the same time as the --include-buckets flag. Takes a comma-separated list of bucket names.

--include-buckets <bucket_list>

  • Restores only buckets in a backup that are specified in <bucket_list>. This flag cannot be specified at the same time as the --exclude-buckets flag. Takes a comma-separated list of bucket names.

--disable-views

  • Skips restoring view definitions for all buckets.

--disable-gsi-indexes

  • Skips restoring GSI index definitions for all buckets.

--disable-ft-indexes

  • Skips restoring full-text index definitions for all buckets.

--disable-data

  • Skips restoring all key-value data for all buckets.

--force-updates

  • Forces data in the Couchbase cluster to be overwritten even if the data in the cluster is newer. By default updates are not forced and all updates use the conflict resolution mechanism in Couchbase to ensure that newer data on the cluster, if any, is not overwritten by older restore data.

--threads <num>

  • Specifies the number of concurrent clients to use when restoring data. Fewer clients mean slower restores that use fewer cluster resources; more clients mean faster restores at the cost of higher cluster resource usage. This parameter defaults to 1 if it is not specified. It is recommended that you do not set it higher than the number of CPUs on the machine where the restore is taking place.

--no-progress-bar

  • By default, a progress bar is printed to stdout so that the user can see how long the restore is expected to take, the amount of data that is being transferred per second, and the amount of data that has been restored. Specifying this flag disables the progress bar and is useful when running automated jobs.
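The --threads recommendation above can be applied mechanically in a script. The following is a minimal sketch, not part of cbbackupmgr: the helper name capped_threads is hypothetical, and it assumes that getconf _NPROCESSORS_ONLN reports the CPU count on the machine running the restore (it falls back to 1 if that is not available).

```shell
# Hypothetical helper: cap a requested --threads value at the local CPU count.
capped_threads() {
    requested="$1"
    # Assumption: getconf _NPROCESSORS_ONLN reports the number of online CPUs.
    cpus="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    # Fall back to 1 (the cbbackupmgr default) if the value is not a plain number.
    case "$cpus" in
        ''|*[!0-9]*) cpus=1 ;;
    esac
    if [ "$requested" -gt "$cpus" ]; then
        echo "$cpus"
    else
        echo "$requested"
    fi
}

# Possible usage (not executed here):
#   cbbackupmgr restore --archive /data/backups --repo example \
#     --host couchbase://127.0.0.1 --username Administrator --password password \
#     --threads "$(capped_threads 8)"
```

The cap only reflects the machine where cbbackupmgr runs; it does not account for load on the target cluster, so a lower value may still be appropriate.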

Host Formats

When specifying a host for the restore command the following formats are expected:

  • couchbase://<addr>
  • <addr>:<port>
  • http://<addr>:<port>

It is recommended to use the couchbase://<addr> format for standard installations. The other two formats take a port number, which is needed for non-default installations where the admin port has been set up on a port other than 8091.
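As an illustration of choosing between these formats, the sketch below builds a --host value from an address and an optional admin port. The helper name host_arg is hypothetical, and picking the http:// form for non-default ports is just one of the two port-carrying formats listed above.

```shell
# Hypothetical helper: choose a --host value from an address and an optional admin port.
host_arg() {
    addr="$1"
    port="${2:-8091}"
    if [ "$port" = "8091" ]; then
        # Default admin port: use the recommended couchbase:// format.
        printf 'couchbase://%s\n' "$addr"
    else
        # Non-default admin port: use a format that carries the port number.
        printf 'http://%s:%s\n' "$addr" "$port"
    fi
}

host_arg 127.0.0.1        # prints couchbase://127.0.0.1
host_arg 127.0.0.1 9000   # prints http://127.0.0.1:9000
```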

Examples

The restore command can be used to restore a single backup or range of backups in a backup repository. In the examples below, let's look at a few different ways to restore data from a backup repository. All the examples assume that the backup archive is located at /data/backups and that all backups are located in the "example" backup repository.

The first thing to do when getting ready to restore data is to decide which backups to restore. The easiest way to do this is to use the list command to see which backups are available to restore.

$ cbbackupmgr list --archive /data/backups --repo example 
 
Size      Items          Name 
2.24GB    -              + example 
1.11GB    -                  + 2016-03-08T14_41_10.757145596-08_00 
1.11GB    -                      + default 
295B      0                          bucket-config.json 
1.11GB    983797                     + data 
1.11GB    983797                         shard_0.fdb 
2B        0                          full-text.json 
128B      0                          gsi.json 
2B        0                          views.json 
430.52MB  -                  + 2016-03-09T14_42_24.024494032-08_00 
430.52MB  -                      + default 
295B      0                          bucket-config.json 
430.52MB  334400                     + data 
430.52MB  334400                         shard_0.fdb 
2B        0                          full-text.json 
128B      0                          gsi.json 
2B        0                          views.json 
728.72MB  -                  + 2016-03-10T14_42_58.743250296-08_00 
728.72MB  -                      + default 
295B      0                          bucket-config.json 
728.72MB  607500                     + data 
728.72MB  607500                         shard_0.fdb 
2B        0                          full-text.json 
128B      0                          gsi.json 
2B        0                          views.json 

From listing the backup repository we can see that there are three backups available to restore in the "example" backup repository. To restore just one of them, we set the --start and --end flags in the restore command to the same backup name and specify the cluster that we want to restore the data to. In the example below we restore only the oldest backup.

$ cbbackupmgr restore --archive /data/backups --repo example \
--host couchbase://127.0.0.1 --username Administrator --password password \
--start 2016-03-08T14_41_10.757145596-08_00 \
--end 2016-03-08T14_41_10.757145596-08_00

If we want to restore only the two most recent backups then we specify the --start and --end flags with different backup names in order to specify the range we want to restore.

$ cbbackupmgr restore --archive /data/backups --repo example \
--host couchbase://127.0.0.1 --username Administrator --password password \
--start 2016-03-09T14_42_24.024494032-08_00 \
--end 2016-03-10T14_42_58.743250296-08_00

If we want to restore all of the backups in the "example" backup repository then we can omit the --start and --end flags, since their default values are the oldest and most recent backups in the backup repository.

$ cbbackupmgr restore --archive /data/backups --repo example \
--host couchbase://127.0.0.1 --username Administrator --password password
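When scripting a restore, the backup names passed to --start and --end can be taken from the list output rather than typed by hand. The sketch below is one way to do that, assuming the list output has the same column layout as the sample shown earlier (size, items, then "+" and the name) and that backups are listed oldest first. The helper name backup_names is hypothetical.

```shell
# Hypothetical helper: read `cbbackupmgr list` output on stdin and print
# the backup names, one per line, in the order they are listed.
# Assumption: backup lines look like "<size> - + <timestamp-style name>".
backup_names() {
    awk '$3 == "+" && $4 ~ /^[0-9]+-[0-9]+-[0-9]+T/ { print $4 }'
}

# Possible usage (not executed here): restore only the most recent backup.
#   latest="$(cbbackupmgr list --archive /data/backups --repo example | backup_names | tail -n 1)"
#   cbbackupmgr restore --archive /data/backups --repo example \
#     --host couchbase://127.0.0.1 --username Administrator --password password \
#     --start "$latest" --end "$latest"
```

Because this parses human-readable output, it is worth checking the extracted name before running the restore.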

Discussion

The restore command works by replaying the data recorded into backup files. During the restore this creates key-value traffic against the cluster that shows up in the form of "set" operations. The restore command replays data from each file in order to guarantee that older backup data does not overwrite newer data. The restore command uses Couchbase's conflict resolution mechanism by default to ensure this behavior. The conflict resolution mechanism can be disabled by specifying the --force-updates flag when executing a restore.

Unlike backups, restores cannot be resumed if they fail.
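Because a failed restore has to be started again from the beginning, automated jobs may want to rerun the whole command a limited number of times. The following is a minimal sketch of such a wrapper. The function name retry_from_start is hypothetical, and the sketch assumes that cbbackupmgr exits with a non-zero status on failure and that repeating the full restore is acceptable for the cluster in question.

```shell
# Hypothetical wrapper: rerun a command from the beginning, up to max_attempts times.
# Returns 0 as soon as one attempt succeeds, 1 if every attempt fails.
retry_from_start() {
    max_attempts="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Possible usage (not executed here):
#   retry_from_start 3 cbbackupmgr restore --archive /data/backups --repo example \
#     --host couchbase://127.0.0.1 --username Administrator --password password \
#     --no-progress-bar
```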

Environment And Configuration Variables

(None)