cbbackupmgr-merge - Merges two or more backups together.
cbbackupmgr merge [--archive <archive_dir>] [--repo <repo_name>] [--start <backup>] [--end <backup>]
The merge command merges two or more backups together. Because cbbackupmgr always takes incremental backups, it is necessary to reclaim disk space from time to time. Merging de-duplicates keys that appear in more than one of the backups being merged, producing a single, smaller backup. Merges are intended to replace the full backup step: multiple incremental backups of a Couchbase cluster are converted into a single full backup. Because this process takes place in the backup archive, merging adds no overhead on the cluster. See Enterprise Backup Strategies for suggestions on using the merge command in your backup process.
Below is a list of required parameters for the merge command.
Required
- --archive <archive_dir>: The archive directory to merge data in.
- --repo <repo_name>: The name of the backup repository to merge data in.
- --start <backup>: The name of the first backup to be merged.
- --end <backup>: The name of the last backup to be merged.
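The values passed to --start and --end are backup names as they appear in the list output. The following is a minimal sketch of picking the oldest and newest names automatically. It assumes, based on the list output in the example on this page, that a repository is a directory whose subdirectories are the timestamp-named backups, and that all backups share the same UTC offset so the names sort chronologically. backup_range is a hypothetical helper, not part of cbbackupmgr.

```shell
# Hypothetical helper, not a cbbackupmgr feature.
# Prints the oldest and then the newest backup name found in a repository
# directory, or fails if fewer than two backups are present.
# Assumption: every non-hidden subdirectory of the repository is a backup.
backup_range() {
    repo_dir=$1
    # Collect the names of non-hidden subdirectories, sorted lexicographically.
    names=$(
        for d in "$repo_dir"/*/; do
            [ -d "$d" ] && basename "$d"
        done | sort
    )
    # Count the non-empty lines, which is the number of backups found.
    count=$(printf '%s\n' "$names" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -lt 2 ]; then
        echo "need at least two backups, found $count" >&2
        return 1
    fi
    printf '%s\n' "$names" | head -n 1   # candidate for --start
    printf '%s\n' "$names" | tail -n 1   # candidate for --end
}
```

The two printed lines can then be supplied as the --start and --end arguments of cbbackupmgr merge. Hidden entries such as .merge are skipped because the shell glob does not match names beginning with a dot.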
In order to merge data, you need to have a backup repository with at least two backups. Below is an example of merging a backup repository named "example" that has two backups in it. The first backup contains the initial dataset. The second backup was taken after four items were updated.
$ cbbackupmgr list --archive /data/backups
 Size      Items          Name
 148.70MB  -              /
 148.70MB  -              + example
 98.66MB   -                  + 2016-03-01T16_27_10.093782029-08_00
 98.66MB   -                      + travel-sample
 300B      0                          bucket-config.json
 98.66MB   31592                      + data
 98.66MB   31592                          shard_0.fdb
 2B        0                          full-text.json
 4B        0                          gsi.json
 1.72KB    1                          views.json
 50.04MB   -                  + 2016-03-01T16_27_51.349151165-08_00
 50.04MB   -                      + travel-sample
 300B      0                          bucket-config.json
 50.04MB   4                          + data
 50.04MB   4                              shard_0.fdb
 2B        0                          full-text.json
 4B        0                          gsi.json
 1.72KB    1                          views.json

$ cbbackupmgr merge --archive /tmp/backup --repo example \
  --start 2016-03-01T16_27_10.093782029-08_00 \
  --end 2016-03-01T16_27_51.349151165-08_00

$ cbbackupmgr list --archive /tmp/backup
 Size      Items          Name
 98.84MB   -              /
 98.84MB   -              + example
 98.84MB   -                  + 2016-03-01T16_27_51.349151165-08_00
 98.84MB   -                      + travel-sample
 300B      0                          bucket-config.json
 98.84MB   31592                      + data
 98.84MB   31592                          shard_0.fdb
 2B        0                          full-text.json
 4B        0                          gsi.json
 1.72KB    1                          views.json
After the merge completes, the merged backup contains 31592 items, the same number as the first backup. This is because the four keys in the second backup were also contained in the first backup; the second backup held newer values for those keys, which overwrote the values from the first backup during the merge. The merged backup folder carries the timestamp of the latest backup because the new backup is a snapshot of the cluster at that point in time.
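The overwrite rule described above can be illustrated with a toy model. The sketch below is not how cbbackupmgr stores or merges data; it assumes each backup is a plain text file of tab-separated key and value lines, and merge_latest_wins is a hypothetical name. It ignores deletions and only shows that a key present in several backups appears once in the result, with the value from the newest backup.

```shell
# Illustrative only: models each backup as lines of "key<TAB>value".
# Files are passed oldest first; for a repeated key the last value seen wins.
merge_latest_wins() {
    awk -F '\t' '
        # Remember each key the first time it appears, to keep a stable order.
        !($1 in value) { order[++n] = $1 }
        # Values from later files overwrite values from earlier files.
        { value[$1] = $2 }
        END {
            for (i = 1; i <= n; i++)
                printf "%s\t%s\n", order[i], value[order[i]]
        }
    ' "$@"
}
```

Given a first file with keys a, b, and c and a second file that only updates b, the output still has three lines, which mirrors why the item count in the example stays at 31592.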
The merge command must be able to merge backups without corrupting the backup archive or leaving it in an intermediate state. To ensure this, cbbackupmgr always creates a new backup and completely merges all data before removing any backup files. When a merge starts, a .merge_status file is created in the backup repository to track the merge progress. cbbackupmgr then copies the first backup to the .merge folder and begins merging the other backups into the .merge folder. After each backup is merged, the .merge_status file is updated. If all backups are merged successfully, cbbackupmgr deletes the old backups and then copies the fully merged backup into a folder with the same name as the backup specified by the --end flag. If cbbackupmgr fails during this process, then on its next invocation the merge is either completed or the partial merge files are removed from the backup repository.
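The ordering of these steps can be sketched with a toy model in which a backup is simply a directory of files and merging means copying files so that later backups overwrite earlier ones. This is not cbbackupmgr's implementation: staged_merge is a hypothetical function, the contents written to .merge_status are invented for illustration, and the final move and status-file cleanup are simplifying assumptions.

```shell
# Toy model of the staged merge: old backups are only removed after the
# merged copy in .merge is complete. Not cbbackupmgr's actual code.
# Usage: staged_merge <repo_dir> <oldest_backup> ... <newest_backup>
staged_merge() {
    repo=$1
    shift
    first=$1
    for last in "$@"; do :; done       # after the loop, last = newest backup

    : > "$repo/.merge_status"          # create the progress file
    rm -rf "$repo/.merge"
    cp -R "$repo/$first" "$repo/.merge" || return 1
    echo "merged $first" >> "$repo/.merge_status"

    for b in "$@"; do
        [ "$b" = "$first" ] && continue
        # Later backups overwrite files of the same name in the staging area.
        cp -R "$repo/$b/." "$repo/.merge/" || return 1
        echo "merged $b" >> "$repo/.merge_status"
    done

    # Everything is merged, so it is now safe to remove the old backups.
    echo "complete" >> "$repo/.merge_status"
    for b in "$@"; do
        rm -rf "$repo/$b"
    done
    # Assumption: a move stands in for the copy described in the text.
    mv "$repo/.merge" "$repo/$last"
    rm -f "$repo/.merge_status"
}
```

If the sketch is interrupted before the "complete" marker is written, the original backups are all still present and the .merge folder can be discarded; after the marker, the staging folder holds the full result and the remaining steps can be finished. That is the property the status file is there to provide.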
Because the merge command creates a new backup before it removes the old ones, the backup archive needs at least as much free space as the combined size of the backups that are to be merged.
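A pre-flight check along these lines can be run before starting a merge. The sketch below uses only the standard du, df, and awk utilities; has_space_for_merge is a hypothetical helper, and the assumption that each backup is a directory inside the repository directory is inferred from the list output in the example on this page.

```shell
# Hypothetical helper, not part of cbbackupmgr.
# Succeeds only if the filesystem holding the repository has at least as many
# free kilobytes as the given backup directories occupy together.
# Usage: has_space_for_merge <repo_dir> <backup_dir>...
has_space_for_merge() {
    repo_dir=$1
    shift
    # du -sk prints one "<kilobytes> <path>" line per directory; sum them.
    needed_kb=$(du -sk "$@" | awk '{ total += $1 } END { print total + 0 }')
    # df -Pk prints a header plus one data line; field 4 is available space.
    free_kb=$(df -Pk "$repo_dir" | awk 'NR == 2 { print $4 }')
    echo "needed=${needed_kb}KB free=${free_kb}KB"
    [ "$free_kb" -ge "$needed_kb" ]
}
```

The function returns a non-zero status when space is insufficient, so a wrapper script can run it first and invoke cbbackupmgr merge only on success.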
For suggestions on how to use the merge command in your backup process, see Enterprise Backup Strategies.
Environment And Configuration Variables
- .merge_status: File storing information on the progress of the merge.
- .merge: Directory storing intermediate merge data.