cbrestore

cbrestore

The cbrestore tool restores data from a file to an entire cluster or a single bucket in the cluster.

Syntax

The basic syntax is:

cbrestore [options] [backup-dir] [destination]

Where:

  • [options]: Command options for cbrestore.
  • [backup-dir]: The backup directory for the source data created by cbbackup when performing the backup.
  • [destination]: This is a bucket in an existing cluster used as the destination bucket for the restored information. If you are restoring data to a single node in a cluster, provide the hostname and port of the node, If you are restoring an entire bucket, provide the URL of one of the nodes within the cluster. Be sure to create the destination bucket before restoring the data.

Description

Items that had been written to file on disk are restored to RAM.

The cbbackup, cbrestore, and cbtransfer tools do not communicate with external IP addresses for server nodes outside of a cluster. Backup, restore, or transfer operations are performed on data from a node within a Couchbase cluster. They only communicate with nodes from a node list obtained within a cluster. If you install Couchbase Server with a default IP address, you cannot use an external host name to access it.

Restoring Global Secondary Indexes

If the backup being restored contains global secondary index definitions (stored in index.json), then these are also restored as part of the cbrestore process. Unlike both data and map-reduce views, global secondary indexes are not automatically distributed across a cluster. As a result, many users manually distribute their indexes across separate index nodes to ensure high availability (see CREATE INDEX).

Due to the topology-agnostic nature of cbbackup, these index definitions are stored without the explicit node names, which are passed in as part of the CREATE INDEX command. cbbackup does however store the indexer-id for each index, a value used to internally identify a specific index node. In order to restore indexes to a cluster, cbrestore sends each index definition, along with this identifier, to the index service which will appropriately distribute these indexes across all index nodes. When you restore indexes to the same cluster (and same topology) that the backup was taken from, the indexing service is able to locate the appropriate index nodes so that the indexes are on the same nodes as before. Often the topology of a cluster at the time of backup is not the same as the topology when restoring. In such a case, the indexing service may be unable to find the specific node associated with an index which it is restoring. The behavior when the specific index node cannot be found is for the index service to distribute the index definition to another index node, chosen at random. This random distribution can cause indexes with the same definitions to be placed on the same node rather than separate nodes. You should always check that the indexes have been correctly distributed after running a restore, you may need to manually redistribute some indexes if necessary.

Once the index definitions have been restored by cbrestore they must be built manually (see BUILD INDEX) as the index service only creates them (without building them). This is to prevent resource contention caused by simultaneous data restoration and building of indexes during the restore process. Additionally this gives the Couchbase administrator a chance to examine and redistribute the indexes before they are built.

You can check whether or not an index is in a ‘created’, ‘built’ or ‘building’ state by viewing the ‘Index’ tab in the web console.

Conflict Resolution

By default, when restoring from a backup using cbrestore, Couchbase Server will perform conflict resolution on all documents to be restored. The conflict resolution behavior is the same as cross datacenter replication (XDCR), which is detailed in XDCR architecture. This is so that documents which have been updated after the backup are not incorrectly overwritten by the restore process.

As a result, after running the restore process, some documents may not match the content of the backup file. In certain cases where a document has been recently deleted on the cluster, the document may not be restored at all due to this conflict resolution.

To restore the contents of a backup while overwriting current documents with the same key, pass the extra parameter conflict_resolve=1 as part of the cbrestore command. To ensure that only the documents contained in the backup exist in the bucket after performing the restore, flush the bucket prior to performing the restore

Tool Location

The tool is in the following locations:

Operating system Location
Linux /opt/couchbase/bin/cbrestore
Windows C:\Program Files\Couchbase\Server\bin\cbrestore
Mac OS X /Applications/Couchbase Server.app/Contents/Resources/couchbase-core/bin/cbrestore
Important: Be sure to create the destination bucket before restoring the data.

Options

The following are the command options:

Table 1. cbrestore options
Option Description
-h, --help Command line help.
-a, --add Used to not overwrite existing items in the destination. Use add instead of set.
-b BUCKET_SOURCE, --bucket-source=BUCKET_SOURCE Single named bucket from the backup directory to restore. If the backup directory only contains a single bucket, then that bucket is automatically used.
-B BUCKET_DESTINATION, --bucket-destination=BUCKET_DESTINATION When --bucket-source is specified, overrides the destination bucket name. This allows you to restore to a different bucket and defaults to the same as the bucket-source.
from-date=FROM_DATE Restore data from the date specified as yyyy-mm-dd *. By default, all data from the very beginning is restored.

The updated tool adds new options to support partial restore operations. The tool still supports the existing options for full restores.

to-date=TO_DATE Restore data until the date specified as yyyy-mm-dd*. By default, all data collected is restored. The updated tool adds new options to support partial restore operations. The tool still supports the existing options for full restores.
-i ID, --id=ID Transfer only items that match a vBucket ID.
-k KEY, --key=KEY Transfer only items with keys that match a regexp.
-m MODE, --mode=MODE  
-n, --dry-run No actual transfer. Just validate parameters, files, connectivity, and configurations.
-u USERNAME, --username=USERNAME REST user name for source cluster or server node.
-p PASSWORD, --password=PASSWORD REST password for cluster or server node.
--single-node Use a single server node from the source only, not all server nodes from the entire cluster. This single server node is defined by the source URL.
-t THREADS, --threads=THREADS Number of concurrent workers threads performing the transfer.
-v, --verbose Verbose logging. More v's provide more verbosity. Max: -vvv.
-x EXTRA, --extra=EXTRA Provide extra, uncommon configuration parameters. Comma-separated key=val(key-val)* pairs.
* The format of the DATE specification is YYYY-MM-DD, where:
YYYY
4-digit year
MM
2-digit month
DD
2-digit day

The following are extra, specialized command options with the cbrestore -x parameter.

Table 2. cbrestore -x options
-x option Description
backoff_cap=10 Maximum back-off time during the rebalance period.
batch_max_bytes=400000 Transfer this # of bytes per batch.
batch_max_size=1000 Transfer this # of documents per batch.
cbb_max_mb=100000 Split backup file on destination cluster if it exceeds the MB.
conflict_resolve=1 By default, disable conflict resolution.
data_only=0 For value 1, transfer only data from a backup file or cluster.
design_doc_only=0 The documents are restored from a backup file (created with the cbbackup) tool. For value 1, transfer only design documents from a backup file or cluster. Default: 0.
max_retry=10 Max number of sequential retries if the transfer fails.
mcd_compatible=1 For value 0, display extended fields for stdout output.
nmv_retry=1 0 or 1, where 1 retries transfer after a NOT_MY_VBUCKET message. Default: 1.
recv_min_bytes=4096 Amount of bytes for every TCP/IP batch transferred.
rehash=0 For value 1, rehash the partition IDs of each item as it is required when transferring data between clusters with the different number of partitions; for example, when transferring data from an Mac OS X server to a non-Mac OS X cluster.
report=5 Number batches transferred before updating the progress bar in the console.
report_full=2000 Number batches transferred before emitting the progress information in the console.
seqno=0 By default, start seqno from beginning.
try_xwm=1 Transfer documents with metadata. Default: 1. The value of 0 is used only when transferring from 1.8.x to 1.8.x.
uncompress=0 For value 1, restore data in uncompressed mode.

Examples

The following are basic syntax examples:

cbrestore ~/backup http://192.0.2.0:8091 -b travel-sample
cbrestore ~/backup couchbase://192.0.2.0:8091 -b travel-sample
cbrestore ~/backup memcached://192.0.2.0:11211 -b travel-sample		

Example for restoring only design documents

The following example restores design documents from the backup file, ~/backup/travel-sample, to the destination bucket, travel-sample, in a cluster.

cbrestore ~/backup http://192.0.2.0:8091 -x design_doc_only=1 \ 
    -b travel-sample -B travel-sample
If multiple source buckets were backed up, this command must be performed multiple times. In the following example, a cluster with two data buckets is backed up and has the following backup files:
  • ~/backup/travel-sample/design.json
  • ~/backup/beer-sample/design.jsonT

The following command restores the design documents in both backup files to a bucket in a cluster named my_bucket.

cbrestore -b travel-sample -B travel-sample -x design_doc_only=1 \
    ~/backup http://192.0.2.0:8091    

cbrestore -b beer-sample -B beer-sample -x design_doc_only=1 \
    ~/backup http://192.0.2.0:8091

The following example response shows a successful restore.

transfer design doc only. bucket msgs will be skipped.
done     

Example for restoring data incrementally

The following example requests a restoration of data backed up between August 1, 2014, and August 3, 2014. The -b option specifies the name of the bucket to restore from the backup file, and the -B option specifies the name of the destination bucket in the cluster.

cbrestore -b travel-sample -B travel-sample \
    --from-date=2014-08-01 --to-date=2014-08-03 ~/backup \
    http://192.0.2.0:8091

Example for restoring data while ignoring conflict resolution

The following example requests a restoration of data while ignoring the built-in conflict resolution. This causes all documents in the backup file to be restored regardless of whether the documents in the target cluster would win conflict resolution.

cbrestore -b travel-sample -B travel-sample -x conflict_resolve=0 \
    ~/backup http://192.0.2.0:8091