Upgrade a Couchbase Server cluster in a manner that is best suited to your operations.
Couchbase Server can be upgraded in multiple ways each with their own pros and cons. Which way is best for your cluster will depend on a few factors in your architecture and company organization (e.g. SLAs). On this and subsequent pages of this section we will discuss not just the mechanics of an upgrade, but why each may or may work best for you.
Option #1 - Rolling Online Upgrade
A rolling online upgrade is the recommended upgrade process for a Couchbase Server cluster and should be chosen unless you have special circumstances that would rule it out. This method avoids downtime for the database and applications because all operations continue normally and high availability can be maintained. Rolling upgrades use Couchbase's rebalance capability to redistribute the data amongst the nodes of a cluster, keeping it available for the duration of the upgrade. Prior to version 5.0, special care should be taken when upgrading nodes running the index service.
Couchbase Server is specifically designed to provide fully online upgrades. If you find this is not the case, please open a support ticket or report a bug.
There are three options for rolling online upgrades:
- Swap Rebalance
- This method entails introducing new nodes into a Couchbase Server cluster as you remove an equal number of nodes to be upgraded. It uses a feature called Swap Rebalance to move data efficiently from the existing nodes to the new nodes, without involving other nodes in the cluster.
- As long as you swap an equal number of nodes (e.g. one for one, two for two, etc.), a Swap Rebalance will be triggered. This also keeps the cluster capacity consistent so as to not interfere with the load running on the cluster. While this method is the safest and provides the most availability, it may require multiple rebalances and therefore be longer as compared to other upgrade options. If the speed of an upgrade is a primary concern for your cluster, see Graceful Failover or Performing the Offline Upgrade.
- Remove and Rebalance
- This method is suitable when you must complete the upgrade using only the nodes currently in the cluster but want to maintain High Availability during the upgrade process. Since this will reduce the capacity of the cluster, it is important to ensure that there is enough spare capacity across all necessary resources (disk, CPU, RAM, etc) during the upgrade process. This process involves removing one running node from the cluster and rebalancing. That node can then be upgraded and used in the Swap Rebalance upgrade procedure above. When the last node has been upgraded, it can be rebalanced back into the cluster to return to full capacity.
- Like a swap rebalance upgrade, this style will require multiple rebalances to complete.
- Graceful Failover and Delta Recovery
- This option involves performing a rolling online upgrade using a Graceful Failover followed by Delta Recovery instead of the full addition and removal of nodes in a Swap Rebalance. It is typically faster and less resource intensive because data does not need to be completely moved between nodes, rather the replicas are synchronized and activated during the failover and the data resynchronized when the node returns following the upgrade. Another advantage compared to the other online upgrades is that this method preserves the global secondary indexes and doesn’t need to rebuild them.
- The primary downside to this option is decreased high availability as replicas are used for faster failover and not recreated until the node is returned. This option is not available when choosing to upgrade with net-new systems (as in the case of many cloud deployments) since those new nodes would not have the previous nodes’ data in place. Use Option #1 when upgrading with net-new systems.
Option #2 - Upgrade Using XDCR
For this option, another Couchbase Server cluster is created (or already exists) and connected via Cross Datacenter Replication (XDCR). The application is transitioned to use one cluster while the other(s) is/are upgraded. While this upgrade process is relatively straightforward to set up, it requires more investment in servers/instances and networking, as well as a change to the application so that it can switch between clusters. It is best used when an existing XDCR connection is already in place, the application needs a live rollback option or the original cluster cannot be upgraded online for some reason.
Keep in mind that XDCR is eventually consistent between clusters and so care should be taken when switching the application from one to the other.
For the fastest upgrade, the unused cluster(s) can be upgraded following the offline upgrade steps below (if not installed anew)
Option #3 - Offline Upgrade
Choose an offline upgrade when the situation calls for an easy and fast upgrade method as well as when the database can incur a controlled outage. The offline upgrade is more likely to succeed in situations where an online upgrade option might fail, but also the rare time a cluster is unstable and has been determined that a Couchbase Server upgrade will fix a specific issue.
This procedure involves upgrading one or more nodes without removing them from the cluster. In some cases the whole cluster may be shut down, upgraded and restarted.
It is recommended to disable auto-failover before using this method and to re-enable it once complete.
Choosing the Upgrade Strategy
Both the online and offline upgrade processes have trade-offs. The following table illustrates some important aspects of the two upgrade strategies.
|Feature||Online upgrade||XDCR upgrade||Offline upgrade|
|Applications remain available||Yes||Yes||No|
|Cluster stays in operation||Yes||No||No|
|Cluster must be shut down||No||Yes||Yes|
|Typical steps||Rebalance, upgrade, rebalance||Switch to XDCR cluster, upgrade, switch back||Upgrade one or more nodes without removing from cluster.|