Data Access

Data Access

Couchbase provides multiple ways of accessing data: using a key-value access pattern, querying data using MapReduce (views) or N1QL. You can also use Full Text Search (FTS) feature to query data.

At a high level, key-value access provides a high performance access path directly to data using the key to the item stored. For applications looking for sub millisecond latencies, key-value method provides the fastest access your data. However, not all data access can have a key in hand and that is where queries come in. Queries can be done using MapReduce views and N1QL. N1QL provides a flexible and declarative query language that is similar to SQL access provided by relational databases. N1QL is great for fast access to your data for operations such as secondary lookup on attributes in your JSON document. For example, lookup users using the user email address and not the userID that is used for the document key.

Views also provide a powerful way to index data with user defined map and reduce functions. Views can be a powerful option for complex reshaping and pre-aggregation of data. Views are the ideal solution for interactive reporting over your data.

Key-Value Operations

At the heart of Couchbase is the distributed key-value (KV) store. A KV store is an extremely simple, schema-less approach to data management that, as the name implies, stores a unique ID (key) together with a piece of arbitrary information (value); it may be thought of as a hash map or dictionary. The KV store itself can accept any data, whether it be a binary blob or a JSON document, and Couchbase features such as N1QL and MapReduce make use of the KV store’s ability to process JSON documents.

Due to their simplicity, KV operations execute with extremely low latency, often sub-millisecond. While the Query service is accessed by a defined query language (N1QL), the KV store is accessed using simple CRUD (Create, Read, Update, Delete) APIs, and thus provides a simpler interface when accessing documents using their IDs (primary keys).

The KV store contains the authoritative, most up-to-date state for each item. In order to perform better, query and MapReduce services provide eventually consistent indexes that, by default, use a version of the data that is potentially slightly out-of-date. However, they can instead elect to wait briefly to make sure they have had a chance to update before responding to a query. By contrast, querying the KV store directly will always access the latest version of data.

While N1QL provides a richer query interface, applications will use the KV store when speed, consistency, and simplified access patterns are preferred over flexible query options.

All KV operations are atomic, which means that Read and Update are individual operations. In order to avoid conflicts that might arise with multiple concurrent updates to the same document, applications may make use of Compare-And-Swap (CAS), which is a per-document checksum that Couchbase modifies each time a document is changed.

MapReduce Views

Developers can write custom JavaScript MapReduce programs to specify complex indexing and aggregation of items stored in Couchbase. MapReduce is a programming model for distributed data processing on highly parallelizable data: the map function reads all documents across the cluster, filters them to select the relevant information, and then emits the results; the reduce function sorts and aggregates the results. MapReduce is most useful for highly customized data processing on large input sets.

MapReduce data processing is incremental, so the output continues to update as the underlying data undergoes mutations.

Programmers can also write Spatial MapReduce programs that operate on geometric data in the form of GeoJSON, n-dimensional numeric data (hyper-cubes), or a combination of the two. This can be used to handle queries about geometries, for example, to return a list of all items within a particular bounding box.

MapReduce programs output either MapReduce Views or Spatial Views, which are described further in Indexing.

Querying with N1QL

Couchbase Server can be programmed using SQL. Given that nearly all programmers already know SQL, most developers will be able to get started quickly on Couchbase. And since most organizations already have significant amounts of SQL code, Couchbase Server fits into the technology landscape easily. Support for SQL by JDBC and ODBC drivers opens the ecosystem of tools for analytics and data integration such as Microsoft Excel and Tableau.

N1QL (pronounced "nickel") is the first query language that leverages the complete flexibility of JSON with the full power of SQL. Developed by Couchbase for use with the Couchbase Server, N1QL provides a common query language and JSON-based data model for distributed document-oriented databases.

Couchbase created its own SQL dialect called N1QL in order to give developers and enterprises an expressive, powerful, and complete language for querying, transforming, and manipulating JSON data. The N1QL query engine is optimized for modern, highly parallel multi-core execution. N1QL has special extensions that allow it to deal with documents with variable and/or nested structures. Just as SQL operates on rows, columns and tables of an RDBMS and returns rows and columns to the application, N1QL operates on JSON and returns JSON to the application.

N1QL provides a rich set of features that let you retrieve, manipulate, transform, and create JSON document data. The key features include:

  • SELECT Statement: The SELECT statement in N1QL extends the functionality of the SQL SELECT statement to work with JSON documents. You can use your knowledge of SQL to work with the powerful NoSQL features of Couchbase, which in turn enable you to work with big data, big data users, and cloud computing.
  • Data Manipulation Language (DML): N1QL provides the DELETE, INSERT, UPDATE, and UPSERT statements to create, delete, and modify the data stored in JSON documents. You can perform these operations by specifying and executing simple commands.
  • Indexes: N1QL provides the CREATE and DROP INDEX statements to easily create and delete indexes.
  • Primary Key Access: Data in the Couchbase Server can be stored in key-value pairs, providing primary key access to any data.
  • Aggregation: N1QL provides the standard SQL aggregation operators such as MIN, MAX, COUNT, as well as the grouping operators, the GROUP BY clause, and the group filter, HAVING.
  • Joins: N1QL lets you retrieve data from multiple documents by specifying joins in the FROM clause.
  • Nesting: Nests enable you to retrieve sub-records for a record. That is, for each left hand input, the matching right hand inputs are collected into an array, which is then embedded in the result. For example, you can nest each item in a customer’s order. Nests can be performed on multiple documents.
  • Unnesting: Unnesting is the opposite of nesting. The UNNEST function extracts nested sub-records from a record, making each extraction a separate record. Note that unnesting can be performed on a single document.
  • Subqueries: N1QL supports multi-part queries through subqueries. You can use a subquery with the SELECT statement or nest it within another subquery, thereby refining the results of the preceding queries.
  • UNION, INTERSECT, and EXCEPT: N1QL provides the UNION, INTERSECT, and EXCEPT operations to compare two or more queries. UNION combines the results of two or more queries into a single result set that includes all the rows that belong to all queries in the union. EXCEPT returns any distinct values from the left query that are not found on the right query. INTERSECT returns any distinct values that are returned by the queries on both sides of the INTERSECT operand.

For more information on how to query using N1QL, see Querying with N1QL. For the complete language and REST API reference, see N1QL Language Reference and the N1QL REST API.

Full text search

Couchbase Server performs search queries using Full Text Search Reference (Developer Preview), an integrated full text search engine. With Couchbase Full Text Search, you, as a developer, can easily add full-text search capabilities to your application without deploying additional components, which reduces operational complexity. Alternatively, if you are using external search engines such as Elasticsearch or Lucidworks, you can leverage the available connectors to continuously replicate data from the Couchbase Server cluster to the search engines.