Data Compression

Couchbase Server supports data compression in its communications with internal and external clients, and in its internal handling of documents.

Overview

Compressing data allows RAM and disk space to be used more efficiently, and may also reduce consumption of network bandwidth. The trade-off is that CPU consumption may increase.

Compression, provided by the open-source library Snappy, is applied to documents based both on the client's capabilities and on the compression mode established by the user for the given bucket.

Compression is available only in Couchbase Enterprise Edition, and can be applied only to Couchbase and Ephemeral buckets.

Where Compression is Used

Clients based on the Couchbase SDK (1), nodes within a cluster that participate in intra-cluster replication (4), internal Couchbase Services (5), external DCP clients (6), and remote clusters that participate in Cross Data-Center Replication (7) communicate their ability to send and receive compressed documents by using the HELO command, with a flag that confirms support of the Snappy Compression data type.

Couchbase Server may (depending on the mode of the bucket) store documents in compressed form in memory (2). The server always compresses documents when storing them on disk (3).
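The capability exchange described above can be sketched as a client building its HELO request. This is a minimal illustration, not the SDK's implementation. It assumes that HELO corresponds to the memcached binary-protocol HELLO opcode (0x1f), that the Snappy feature code is 0x000a, and that the request carries a client-agent string as key and a list of 2-byte feature codes as value. The function names `build_hello` and `parse_features` are hypothetical.

```python
import struct
from typing import List, Sequence

REQUEST_MAGIC = 0x80     # binary-protocol request marker
HELLO_OPCODE = 0x1F      # assumed opcode for HELO/HELLO
FEATURE_SNAPPY = 0x000A  # assumed feature code for Snappy compression


def build_hello(agent: bytes, features: Sequence[int]) -> bytes:
    """Build a HELLO request: 24-byte header, agent as key, features as value."""
    value = b"".join(struct.pack(">H", f) for f in features)
    body = agent + value
    header = struct.pack(
        ">BBHBBHIIQ",
        REQUEST_MAGIC,  # magic
        HELLO_OPCODE,   # opcode
        len(agent),     # key length
        0,              # extras length
        0,              # data type
        0,              # vbucket id (unused here)
        len(body),      # total body length
        0,              # opaque
        0,              # CAS
    )
    return header + body


def parse_features(value: bytes) -> List[int]:
    """Decode the feature codes echoed back in a HELLO response value."""
    count = len(value) // 2
    return list(struct.unpack(">%dH" % count, value[: count * 2]))


packet = build_hello(b"example-agent", [FEATURE_SNAPPY])
# A client would treat Snappy as negotiated only if the server echoes the code.
snappy_enabled = FEATURE_SNAPPY in parse_features(b"\x00\x0a")
```

In this sketch the client only sends or expects compressed documents when `snappy_enabled` is true; otherwise it falls back to uncompressed values.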

Compression Modes

Each bucket is configured to support one of three modes. After a client has communicated its ability to send and receive compressed data, the server's running of compression and decompression routines depends on the mode supported by the specified bucket. The modes are as follows:

  • Off: Provides the behavior of Couchbase Server pre-5.5. On receipt of a compressed document, Couchbase Server decompresses the document for storage in memory, and recompresses it for storage on disk. Couchbase Server sends the document in uncompressed form.

    This mode is assigned by default to buckets upgraded from a previous version of Couchbase Server. The mode is recommended for use where clients cannot benefit from compression, and where neither memory-resources nor network-bandwidth will be negatively impacted by the size and quantity of documents to be handled. Note also that this is the only mode under which Memcached buckets operate.

  • Passive: On receipt of a compressed document, Couchbase Server stores it in compressed form both in memory and on disk. Couchbase Server sends the document back to the client in compressed form if this is requested by the client; otherwise, it sends the document back in uncompressed form.

    On receipt of an uncompressed document, Couchbase Server stores it in memory in uncompressed form, and stores it on disk in compressed form. It returns the document to the client in uncompressed form.

    This mode is assigned by default to new buckets, in Couchbase Server 5.5 and beyond. It supports clients that themselves handle compression, and additionally allows Couchbase Server to limit its use of memory-resources and network-bandwidth. At the same time, it does not force Couchbase Server to use CPU resources for the compression and decompression of documents that clients do not themselves require in compressed form.

  • Active: Couchbase Server actively compresses documents for storage in memory and on disk, even if the documents are received in uncompressed form. Documents are decompressed before being sent back to those clients that do not support the receiving of compressed data. Documents are sent in compressed form to clients that do support the receiving of compressed data, even if those clients originally sent the documents to the server in uncompressed form.

    This mode allows the server to practice the maximum conservation of memory-resources and network-bandwidth. Consequently, more documents can be held in memory simultaneously, and thereby accessed with improved overall efficiency (since the processing-time required for compression and decompression is significantly less than that required for fetching data from disk). Nevertheless, in some circumstances, clients that do not themselves require the compression and decompression of documents may be negatively affected by the compression and decompression performed by the server on documents resident in memory.
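The three modes can be summarized as a small decision function. The following is an illustrative model of the rules listed above, not Couchbase Server code. It covers a document written and read under a single mode, and ignores the effects of switching modes. The names `Mode`, `Handling`, and `handle` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    OFF = "off"
    PASSIVE = "passive"
    ACTIVE = "active"


@dataclass(frozen=True)
class Handling:
    memory_compressed: bool    # form of the document held in memory
    disk_compressed: bool      # form of the document stored on disk
    response_compressed: bool  # form of the document returned to the client


def handle(mode: Mode, received_compressed: bool,
           client_accepts_compressed: bool) -> Handling:
    """Model how a document is stored and returned under each mode."""
    if mode is Mode.OFF:
        # Decompressed for memory; always returned uncompressed.
        in_memory = False
        response = False
    elif mode is Mode.PASSIVE:
        # Kept in memory in whatever form it arrived.
        in_memory = received_compressed
        response = received_compressed and client_accepts_compressed
    else:
        # Active: compressed in memory regardless of how it arrived.
        in_memory = True
        response = client_accepts_compressed
    # Documents are compressed on disk in every mode.
    return Handling(in_memory, True, response)


result = handle(Mode.PASSIVE, received_compressed=False,
                client_accepts_compressed=True)
```

Here `result` describes an uncompressed document written to a Passive bucket: uncompressed in memory, compressed on disk, and returned uncompressed even though the client could accept compressed data.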

Switching Between Compression Modes

Buckets can be switched between modes. The following behavior-changes should be noted:

  • When a bucket formerly in Passive mode has been switched to Off mode, any compressed document that Couchbase Server subsequently receives for that bucket is stored in memory in uncompressed form. If the server needs to send from the bucket a document that is currently compressed, the server decompresses the document before sending; however, to preserve memory-efficiency, the document remains compressed in memory.

  • When a bucket has been switched from Passive to Active mode, it periodically runs a task that compresses uncompressed documents.

  • When a bucket formerly in Active mode is switched to Off mode, the task that was periodically run to compress uncompressed documents is disabled. Compressed documents can continue to be received for the bucket, and are decompressed for storage in memory. Documents already compressed within the bucket are decompressed before sending; however, to ensure continued memory-efficiency, they remain compressed in memory.

Enabling Compression

Couchbase Server allows authorized users to configure compression by means of the following:

  • The Couchbase Web Console, the Couchbase CLI, and the Couchbase REST API, which allow each bucket to be assigned a compression mode, and which allow Cross Data-Center Replication (XDCR) to handle compressed documents.

  • The cbbackup, cbrestore, and associated tools, which allow compressed documents to be requested in the backup stream.

  • The Couchbase SDK, which can elect to send and receive compressed documents.

For information on roles that allow modification of bucket-settings, see Authorization.
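As an illustration of assigning a mode through the REST API, the following sketch builds (but does not send) the request. It assumes the bucket-edit endpoint is `POST /pools/default/buckets/<bucket>` on port 8091 and that the setting is named `compressionMode`; check both against the REST reference for your server version. The helper name `build_mode_request` and the host and bucket values are hypothetical, and authentication is omitted.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

VALID_MODES = ("off", "passive", "active")


def build_mode_request(host: str, bucket: str, mode: str) -> Request:
    """Build a POST request that sets a bucket's compression mode."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of: " + ", ".join(VALID_MODES))
    # Assumed endpoint and port; the bucket name is URL-escaped.
    url = "http://%s:8091/pools/default/buckets/%s" % (host, quote(bucket, safe=""))
    # Assumed parameter name, sent as a form-encoded body.
    data = urlencode({"compressionMode": mode}).encode("ascii")
    return Request(url, data=data, method="POST")


request = build_mode_request("localhost", "travel-sample", "active")
```

To apply the change, the request would be sent with `urllib.request.urlopen` after adding credentials for a user whose role permits modification of bucket-settings.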