Global Secondary Indexes

Global Secondary Indexes

Global Secondary Indexes (GSI) support a variety of OLTP like use-cases for N1QL including basic, ad-hoc, and short-running reporting queries that require filtering. For example, if you have a WHERE clause in a N1QL statement that selects a subset of your data on which the secondary index is defined, you can see a significant speedup in the N1QL query performance by using GSI.

Global secondary indexes are deployed independently into a separate index service within the Couchbase Server cluster, away from the data service that processes key based operations. GSIs provide the following advantages:
  • Predictable Performance: Core key based operations maintain predictable low latency even in the presence of a large number of indexes. Index maintenance does not compete with key based operations even under heavy mutations to data.
  • Low Latency Queries: GSIs independently partition into the index service nodes and don’t have to follow hash partitioning of data into vBuckets. Queries using GSIs can achieve low latency response times even when the cluster scales out because GSIs don’t require a wide fan-out to all data service nodes.
  • Advanced Scaling Model: GSI can be placed onto independent set of nodes. Administrators can add new indexes and evolve the application performance without stealing cycles from the incoming workload.

Creating Global Secondary Indexes

You can define a primary or secondary index using GSIs in N1QL using the CREATE INDEX statement and the USING GSI clause. For more information on the syntax and examples, see CREATE INDEX statement.

Placement of Global Secondary Indexes

GSIs reside on the index nodes in the cluster. Each index service node can host multiple indexes and a single index can be hosted on multiple index nodes. Every index has an index key(s) (used for lookup). When the index type is not primary index, index can have an index filter with the WHERE clause.

CREATE INDEX index_name ON ( index_key1, ..., index_keyN) 
    WHERE index_filter 
    WITH { "nodes": [ "node1:8091" ] } 

Based on the index filter, the index can be partitioned across multiple index service nodes and placed on given nodes using the WITH clause and "nodes" argument in CREATE INDEX statement.

Administrators can place each index partition in a separate node to distribute index maintenance and index scan load. Index metadata stored on the index node knows about the distribution of the index. GSI does not use scatter-gather. Instead, based on the index metadata, it only touches the nodes that have the relevant data.