Querying Data with Views

Querying Data with Views

View query requests come in through port 8092 to one of the nodes running the data service. The communication happens over the REST API exposed by the view query engine.

When using the SDK, subsequent requests that come in through port 8092 are load balanced across all the nodes running the data service to execute view queries.
Note: If you use the REST API directly to send requests, the requests are sent to the destination URL and are not load balanced automatically.

The view keys returned by the emit() function can be used when querying a view as the selection mechanism, either by using an explicit view key, a list of view keys, or a range of view keys. View query parameters also provide a number of parameters that can be used to express consistency requirements for the query, and limit, order, and group the view data in various ways.

The following diagram shows the flow of the execution for a view query.
Figure 1. Execution flow for a View query

When a view query is issued, one of the data service nodes receives it and becomes the coordinator node that is responsible to execute the query. Given that the partitioning of the view is based on document keys and views are queries based on view keys, the query gets scattered to all nodes by the coordinator. In the scatter phase, each node running the data service is asked to execute the query on their local view partition. The results from each node running the data service is then gathered by the coordinator node. The coordinator node compiles the final results, applying the required parameters again to limit, skip, and sort the output for final delivery to the client.

Query Consistency with Views

Mutations that arrive at a bucket do not block and wait for view indexing for the mutation to complete. View updates are triggered by the view queries or the automatic view updater. That is, views are eventually consistent with ongoing mutations.

When querying views, as a developer, you may require varying consistency levels. Consistency levels for any given query can be configured using the staleness parameter for the query. The following consistency levels can be specified for the staleness parameter:
  • stale=ok

    This level returns the query with the lowest latency as it is the most relaxed consistency level. Selecting this option essentially means the query can return data that is currently indexed and accessible by the view. The query output can be arbitrarily out-of-date if there are many pending mutations that have not been indexed by the view. This consistency level is useful for queries that favor low latency and do not need precise and most up-to-date information.

  • stale=update_after

    This level returns the query with a low latency. However, subsequent queries may be slow as this level requires a view refresh immediately after the query execution completes. This option only impacts the staleness of subsequent queries to be more up-to-date, up to the timestamp of the running view query. However, it does not provide any additional firm consistency guarantees for the running view query.

  • stale=false

    This level provides the strictest consistency level and thus executes with higher latencies than the other levels. This consistency level requires all mutations, up to the moment of the query request, to be processed before the query execution can start. This ensures that any writes that are done prior to issuing the query request, and maybe more recent mutations, are indexed by the view indexer, and will be returned by the view query if it qualifies for the resultset. This guarantee is important to applications that require consistent reads or read-your-own-write semantics.