Bucket Disk Storage

When storing data in a Couchbase bucket, the server first writes data to the caching layer and eventually stores all data to disk to provide a higher level of reliability.

Couchbase Server first writes data to the caching layer and then places it in a disk write queue to be persisted to disk. Disk persistence enables you to perform backup and restore operations and to grow your datasets beyond the size of the built-in caching layer. This process is called eventual persistence because the server does not block the client while it writes to disk.
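
The write path described above can be illustrated with a minimal write-behind sketch. This is a toy model, not Couchbase Server code: the class `WriteBehindStore` and its methods are hypothetical names, and plain dictionaries stand in for the caching layer and the disk.

```python
import queue
import threading


class WriteBehindStore:
    """Toy model of eventual persistence: cache first, disk later."""

    def __init__(self):
        self.cache = {}                   # stands in for the caching layer
        self.disk = {}                    # stands in for on-disk storage
        self.write_queue = queue.Queue()  # stands in for the disk write queue
        flusher = threading.Thread(target=self._flush_loop, daemon=True)
        flusher.start()

    def set(self, key, value):
        # The caller returns as soon as the cache is updated and the
        # mutation is queued; it does not wait for the disk write.
        self.cache[key] = value
        self.write_queue.put((key, value))

    def _flush_loop(self):
        # Background writer: drain the queue and persist each mutation.
        while True:
            key, value = self.write_queue.get()
            self.disk[key] = value
            self.write_queue.task_done()

    def wait_until_persisted(self):
        # Demonstration helper only: block until the queue is drained.
        self.write_queue.join()


store = WriteBehindStore()
store.set("user::1", {"name": "Ada"})
store.wait_until_persisted()
```

In this sketch `set` never touches the disk directly, which is the property that lets the client continue without waiting for persistence.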

If a node fails and all data in the caching layer is lost, the items can be recovered from disk. When the server identifies an item that must be loaded from disk because it is not in active memory, it places the item in a load queue. A background process works through the load queue and reads the information from disk back into memory. The client waits until the data has been loaded into memory, and the server then returns the information.
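
The read path can be sketched in the same spirit. The following is a simplified illustration under stated assumptions, not the server's implementation: `BackgroundFetcher` is a hypothetical name, a dictionary stands in for the disk, and a `threading.Event` models the client request waiting for the background load.

```python
import queue
import threading


class BackgroundFetcher:
    """Toy model of the load queue for items that are not in memory."""

    def __init__(self, disk):
        self.memory = {}                 # items currently in active memory
        self.disk = disk                 # stands in for on-disk storage
        self.load_queue = queue.Queue()  # stands in for the load queue
        threading.Thread(target=self._load_loop, daemon=True).start()

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        # Not resident: queue a load and wait for the background thread.
        loaded = threading.Event()
        self.load_queue.put((key, loaded))
        loaded.wait()
        return self.memory.get(key)      # None if the key is not on disk

    def _load_loop(self):
        # Background process: read queued keys from disk into memory.
        while True:
            key, loaded = self.load_queue.get()
            if key in self.disk:
                self.memory[key] = self.disk[key]
            loaded.set()


fetcher = BackgroundFetcher(disk={"order::42": "shipped"})
status = fetcher.get("order::42")
```

After the first `get`, the item is resident in `memory`, so later reads of the same key return without going through the load queue.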

Multiple readers and writers
Multithreaded readers and writers provide simultaneous read and write operations for data on disk. Simultaneous reads and writes increase I/O throughput. The multithreaded engine includes additional synchronization among threads that are accessing the same data cache to avoid conflicts. To maintain performance while avoiding conflicts over data, Couchbase Server uses a form of locking between threads and thread allocation among vBuckets with static partitioning.
When Couchbase Server creates multiple reader and writer threads, the server assesses a range of vBuckets for each thread and assigns each thread exclusively to certain vBuckets. With this static thread coordination, the server schedules threads so that only a single reader thread and a single writer thread can access the same vBucket at any given time. The following diagram shows six pre-allocated threads and two data buckets. Each thread has a range of vBuckets that is statically partitioned for read and write access.
Figure 1. Bucket disk storage
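
As a rough illustration of static partitioning, the sketch below divides vBucket IDs into one contiguous range per thread. The helper names `partition_vbuckets` and `owner_of`, the choice of contiguous ranges, and the figures of 1024 vBuckets and three writer threads are assumptions made for the example; the server's actual assignment scheme may differ.

```python
def partition_vbuckets(num_vbuckets, num_threads):
    """Split vBucket IDs 0..num_vbuckets-1 into one contiguous range per thread."""
    base, extra = divmod(num_vbuckets, num_threads)
    ranges = []
    start = 0
    for thread_id in range(num_threads):
        # The first `extra` threads take one additional vBucket each.
        size = base + (1 if thread_id < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def owner_of(vbucket_id, ranges):
    """Return the index of the only thread allowed to touch this vBucket."""
    for thread_id, vb_range in enumerate(ranges):
        if vbucket_id in vb_range:
            return thread_id
    raise ValueError(f"vBucket {vbucket_id} is outside every range")


# Assumed example values: 1024 vBuckets shared by three writer threads.
writer_ranges = partition_vbuckets(1024, 3)
```

Because every vBucket falls in exactly one range, two writer threads in this model can never contend for the same vBucket, so no per-item lock is needed between them.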

Item deletion
Items can be deleted explicitly by client applications or removed when their expiration time passes. Couchbase Server never deletes items from disk unless one of these operations is performed. After a deletion or expiration, a tombstone is maintained as the record of the deletion. Tombstones communicate the deletion or expiration to downstream components. Once all downstream components have been notified, the tombstone is purged as well.
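
The relationship between deletion, expiration, and tombstones can be sketched as follows. This is a toy model with hypothetical names (`TombstoneStore`, `expires_at`); in particular, checking expiration at read time is a simplification chosen for the example, not a statement about how the server detects expired items.

```python
import time


class TombstoneStore:
    """Toy store that keeps a tombstone for every deleted or expired item."""

    def __init__(self):
        self.items = {}       # key -> (value, expiry timestamp or None)
        self.tombstones = {}  # key -> time the item was deleted or expired

    def set(self, key, value, expires_at=None):
        self.items[key] = (value, expires_at)
        self.tombstones.pop(key, None)   # a live item replaces its tombstone

    def delete(self, key, now=None):
        now = time.time() if now is None else now
        if self.items.pop(key, None) is not None:
            self.tombstones[key] = now   # keep a record of the deletion

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            self.delete(key, now)        # expiration also leaves a tombstone
            return None
        return value


ts_store = TombstoneStore()
ts_store.set("session::1", "active")
ts_store.delete("session::1", now=100)
```

The key and a timestamp survive the deletion, which is what allows the deletion to be reported to other components later.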
Tombstone purging
Tombstones are records of expired or deleted items that include item keys and metadata.

Couchbase Server and other distributed databases maintain tombstones in order to provide eventual consistency between nodes and between clusters. Couchbase Server stores the key plus several bytes of metadata per deleted item in two structures per node. With millions of mutations, the space taken up by tombstones can grow quickly, especially when there are large numbers of deleted or expired documents.

The Metadata Purge Interval sets how often a node permanently purges the metadata of deleted and expired items. The purge runs as part of auto-compaction. Purging reduces the storage requirement by roughly a factor of three compared with earlier behavior and also frees up space much faster.
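
The effect of a purge interval can be illustrated with a small sketch. The function `purge_tombstones`, its parameters, and the use of days as the unit are assumptions for the example; it only models the idea of dropping tombstones older than the interval and says nothing about how compaction is scheduled.

```python
SECONDS_PER_DAY = 86400


def purge_tombstones(tombstones, purge_interval_days, now):
    """Drop tombstones older than the purge interval.

    `tombstones` maps a key to the time (in seconds) the item was deleted.
    Returns the surviving tombstones and the number that were purged.
    """
    cutoff = now - purge_interval_days * SECONDS_PER_DAY
    kept = {key: ts for key, ts in tombstones.items() if ts > cutoff}
    return kept, len(tombstones) - len(kept)


# Assumed example: one tombstone from day 0 and one from day 5, checked on day 6.
tombstones = {"old::1": 0, "recent::1": 5 * SECONDS_PER_DAY}
kept, purged = purge_tombstones(tombstones, purge_interval_days=3,
                                now=6 * SECONDS_PER_DAY)
```

A shorter interval frees space sooner, but in this model it also shortens the window in which a tombstone is still available to report the deletion elsewhere.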