Getting Started

Once installation and configuration of the three principal components are complete, you can replicate data from Couchbase to Elasticsearch; run Elasticsearch text-based queries on the replicated Couchbase data; and use document IDs returned by Elasticsearch to initiate queries on Couchbase, thereby returning full Couchbase documents. This section explains how to accomplish these tasks both manually and programmatically.

Replicate Data from Couchbase Server to Elasticsearch

Data-replication from Couchbase Server to Elasticsearch is managed by means of the XDCR facility, accessible from the Couchbase Web Console.

Proceed as follows:

  1. Within the virtual environment, use the browser to access the Couchbase Web Console at the localhost address of your virtual machine, on port 8091: that is, at http://localhost:8091/.

  2. Select the XDCR option, on the navigation-bar, at the left-hand side:

    This brings up the XDCR interface, as follows:

  3. Left-click on the Add Remote Cluster button, at the upper-right of the Remote Clusters panel:

    The following dialog now appears:

    Enter appropriate data into the fields of this dialog, and then left-click on the Save button. The Cluster Name can be any name of your choice, and is used to refer to your Elasticsearch cluster. The IP/hostname should be localhost:9091; that is, the word localhost followed by a colon and the port-number 9091. The Username for Remote Cluster and Password are those you have established for your Elasticsearch cluster, in the elasticsearch.yml configuration-file. You do not need to enable TLS encryption. When you have saved this data, it appears in the XDCR interface.

  4. Left-click on the Create Replication button, to the upper-right of the Ongoing Replications panel. The Add Replication dialog now appears.

    For full details of the fields provided by this dialog, see Managing XDCR. For the immediate purposes of replicating from Couchbase Server to Elasticsearch, the source Bucket should be the bucket whose data you intend to replicate to Elasticsearch (previously designated as beer-sample). The Cluster should be the Elasticsearch cluster you previously created; and its Bucket should be the name of the Elasticsearch index you specified to handle data replicated from Couchbase, which was also beer-sample.

    Then, left-click on the Advanced settings control. When the dialog expands vertically, change the value of the XDCR Protocol to 1, since replication from Couchbase to Elasticsearch is supported only by Version 1 of the XDCR Protocol.

    Next, left-click on the Save button, at the lower-right of the dialog.

At this point, replication from Couchbase to Elasticsearch begins. This is duly represented within the Ongoing Replications panel.
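
The same two steps can, in principle, be scripted rather than performed through the Web Console. The following is a minimal sketch only: it assumes that the Couchbase REST endpoints /pools/default/remoteClusters and /controller/createReplication accept the form-fields shown, and that the value capi for the type field selects Version 1 of the XDCR Protocol. The function names and sample credentials are hypothetical, and nothing is sent to the server; the sketch merely builds the form bodies, so that they can be checked against the REST documentation for your version of Couchbase Server.

```javascript
// Sketch only: builds form bodies for two Couchbase REST calls that are
// assumed to mirror the Web Console steps above. No request is sent.

// Form body for the assumed endpoint POST /pools/default/remoteClusters.
function buildRemoteClusterForm(name, hostname, username, password) {
    return new URLSearchParams({
        name: name,
        hostname: hostname,
        username: username,
        password: password
    }).toString();
}

// Form body for the assumed endpoint POST /controller/createReplication.
function buildReplicationForm(fromBucket, toCluster, toBucket) {
    return new URLSearchParams({
        fromBucket: fromBucket,
        toCluster: toCluster,
        toBucket: toBucket,
        replicationType: 'continuous',
        type: 'capi' // assumed to select Version 1 of the XDCR Protocol
    }).toString();
}

// Hypothetical values: substitute your own cluster name and credentials.
console.log(buildRemoteClusterForm('elasticsearch', 'localhost:9091',
    'Administrator', 'password'));
console.log(buildReplicationForm('beer-sample', 'elasticsearch', 'beer-sample'));
```

Each string would be sent as the body of an authenticated POST request to port 8091, with the content-type application/x-www-form-urlencoded.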

Query Elasticsearch Data Manually

The simplest Elasticsearch query takes the form of a Lucene-based string; and can be dispatched as an HTTP request. For example, on the command-line, within the virtual environment, enter the following:

$ curl http://localhost:9200/beer-sample/_search?q=Classic-Special-Brew\
> +AND+North+American+Lager

This searches the Elasticsearch repository for items that each contain the two strings Classic Special Brew and North American Lager. Output takes approximately the following appearance:

{"took":35,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":156,
"max_score":1.7641015,"hits":[{"_index":"beer-sample","_type":"couchbaseDocument","_id":"aass_bre
wery-classic_special_brew","_score":1.7641015,"_source":{"meta":{"rev":"1-14987863e28400010000000
002000006","flags":33554438,"expiration":0,"id":"aass_brewery-classic_special_brew"}}},{"_index":
"beer-sample","_type":"couchbaseDocument","_id":"otter_creek_brewing_wolaver_s_organic_ales-vermo
nt_lager","_score":0.7017108,"_source":{"meta":{"rev":"1-14987864700e00000000000002000006","flags
":33554438,"expiration":0,"id":"otter_creek_brewing_wolaver_s_organic_ales-vermont_lager"}}},{"_i
ndex":"beer-sample","_type":"couchbaseDocument","_id":"blue_point_brewing-toasted_lager","_score"
:0.68838316,"_source":{"meta":{"rev":"1-14987864d3f500010000000002000006","flags":33554438,"expir
ation":0,"id":"blue_point_brewing-toasted_lager"}}},{"_index":"beer-sample","_type":"couchbaseDoc
ument","_id":"seabright_brewery-brew_ribbon","_score":0.65299296,"_source":{"meta":{"rev":"1-1498
786402b300000000000002000006","flags":33554438,"expiration":0,"id":"seabright_brewery-brew_ribbon
"}}},   .   .   .

Note that the appearance of the JSON documents here displayed can be improved by installation and use of a tool such as jq:

$ curl http://localhost:9200/beer-sample/_search?q=Classic-Special-Brew\
> +AND+North+American+Lager | jq .
{
    "took": 27,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 156,
        "max_score": 1.7641015,
        "hits": [
            {
                "_index": "beer-sample",
                "_type": "couchbaseDocument",
                "_id": "aass_brewery-classic_special_brew",
                "_score": 1.7641015,
                "_source": {
                    "meta": {
                        "rev": "1-14987863e28400010000000002000006",
                        "flags": 33554438,
                        "expiration": 0,
                        "id": "aass_brewery-classic_special_brew"
                    }
                    .
                    .
                    .
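
The URL used in the curl examples above can also be assembled in code. The following is a minimal sketch, in which the function name buildUriSearchUrl is hypothetical: it percent-encodes the Lucene string, so that spaces and other reserved characters do not need to be written by hand as + signs.

```javascript
// Builds an Elasticsearch URI-search URL, percent-encoding the Lucene string.
function buildUriSearchUrl(baseUrl, index, luceneQuery) {
    return baseUrl + '/' + encodeURIComponent(index) + '/_search?q='
        + encodeURIComponent(luceneQuery);
}

console.log(buildUriSearchUrl('http://localhost:9200', 'beer-sample',
    'Classic-Special-Brew AND North American Lager'));
```

The resulting string can be passed to curl within quotation marks, or to any HTTP client.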

Alternatively, more complex forms of query can be performed by means of the Elasticsearch REST API. For example, you can use the following JSON construct:

{
    "query": {
        "query_string": {
            "query": "North American Lager AND Classic Special Brew"
        }
    }
}

For example, enter the following at the command prompt:

$ curl -XPOST 'localhost:9200/beer-sample/_search?pretty' -d '{"query":
> {"query_string": {"query":
> "North American Lager AND Classic Special Brew"}}}'

This produces the same output as the Lucene-based example.
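
The same request can be prepared from node.js. The following is a minimal sketch, which assumes Elasticsearch to be listening on localhost:9200, as elsewhere in this section. The function name buildQueryStringRequest is hypothetical; the sketch only constructs the request body and options, and sends nothing.

```javascript
// Prepares the body and options for a POST to the _search endpoint.
function buildQueryStringRequest(index, luceneQuery) {
    // The JSON construct shown above, with the Lucene string substituted.
    var body = JSON.stringify({
        query: { query_string: { query: luceneQuery } }
    });

    var options = {
        host: 'localhost',   // assumed location of Elasticsearch
        port: 9200,
        method: 'POST',
        path: '/' + encodeURIComponent(index) + '/_search?pretty',
        headers: {
            'Content-Type': 'application/json',
            'Content-Length': Buffer.byteLength(body)
        }
    };

    return { options: options, body: body };
}

var prepared = buildQueryStringRequest('beer-sample',
    'North American Lager AND Classic Special Brew');
console.log(prepared.options.path);
console.log(prepared.body);
```

To send the request, the options object can be passed to the request method of the node.js http module, and the body written to the returned request-object before it is ended.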

For more information on Elasticsearch query-options, see Introducing the Query Language, in the Elasticsearch API documentation.

Use the Elasticsearch Web UI

The Elasticsearch Web UI is located at http://localhost:9200/_plugin/head/, and appears as follows:

The docs field indicates the number of items indexed by Elasticsearch. Note that this may be greater than the actual number of documents in Couchbase Server, because XDCR and the Couchbase Plug-in send additional documents, describing the status of replication. The UI provides various options for performing Elasticsearch queries, which you can experiment with as appropriate.

Examine Elasticsearch Responses

As indicated by the output shown above, responses to Elasticsearch queries contain data on the following:

  • Search-performance. The took parameter indicates the number of milliseconds required for the search; while fields within the _shards object indicate how many Elasticsearch shards were available for search, how many were accessed successfully, and how many unsuccessfully.

  • Items matched. The total field indicates the total number of items matched. The max_score field gives the highest relevance-score among the returned hits; and each hit carries its own _score, which is Elasticsearch’s estimate of the relevance of that search-hit. Note that the _source object contains only metadata, rather than a document’s entire contents: this is because the contents, if and when required, can more rapidly be retrieved from Couchbase itself, using the document ID that is the value of the _id field.
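
The fields described above can be read from a parsed response as follows. This is a minimal sketch: the function name summarizeSearchResponse is hypothetical, and the sample object is abbreviated from the output shown earlier in this section.

```javascript
// Collects the performance and match fields of an Elasticsearch response,
// along with the document IDs that can later be used to query Couchbase.
function summarizeSearchResponse(resp) {
    return {
        tookMs: resp.took,
        timedOut: resp.timed_out,
        shardsFailed: resp._shards.failed,
        totalHits: resp.hits.total,
        maxScore: resp.hits.max_score,
        ids: resp.hits.hits.map(function (hit) { return hit._id; })
    };
}

// Abbreviated from the sample output shown earlier.
var sampleResponse = {
    took: 35,
    timed_out: false,
    _shards: { total: 5, successful: 5, failed: 0 },
    hits: {
        total: 156,
        max_score: 1.7641015,
        hits: [
            { _id: 'aass_brewery-classic_special_brew', _score: 1.7641015 },
            { _id: 'blue_point_brewing-toasted_lager', _score: 0.68838316 }
        ]
    }
};

console.log(summarizeSearchResponse(sampleResponse));
```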

Use Elasticsearch Responses to Query Couchbase Manually

The contents of the _id field, returned from an Elasticsearch query, can be used to retrieve the entire corresponding document from Couchbase. Ensure Couchbase is running; then, proceed as follows:

  1. Access the Couchbase Web Console, at http://localhost:8091/.

  2. Left-click on the Data Buckets tab, near the top. This brings up the Couchbase Buckets screen.

  3. Left-click on the Documents button, towards the right of the beer-sample row. This brings up the Documents screen for the beer-sample bucket.

  4. In the text-field to the left of the Lookup Id button, enter the document-ID retrieved from the Elasticsearch output. Then, left-click on the Lookup Id button. This brings up the document with the specified ID.

Query Elasticsearch and Couchbase Programmatically

This section provides an example of searching Elasticsearch and Couchbase Server programmatically. A JavaScript routine within an html page makes calls on two node.js servers: one being responsible for running server-side queries on Elasticsearch; the other, on Couchbase Server. The structure is as follows:

The annotations to this diagram are as follows:

  1. The html interface allows the user to select a beer-style. On selection, a getJSON call passes the style, in the form of a key-value pair, to the node.js program esNodeJsQueryAgent.

  2. The node.js routine performs an Elasticsearch query on the existing beer-sample index: the returned documents each contain an ID corresponding to a particular beer, which is described by the specified style.

  3. The documents are passed back to the client-side, where the IDs are retrieved. Each is displayed for the user.

  4. The client passes each ID to the node.js program cbNodeJsQueryAgent.

  5. The program cbNodeJsQueryAgent duly queries Couchbase. Couchbase returns, for each ID, a document containing detailed information on the beer specified by the ID.

  6. Each document is returned to the client-side routine, which displays the results for the user.

Access Source-Files

You can access the three source-files for the example at this location: https://github.com/couchbaselabs/elasticsearchdemo/tree/master. The following sections of the current document provide a brief summary of the functionality.

Client-Side HTML and JavaScript

The file couchbaseESqueryDemo.html provides html-based interactive elements for the selection of beer-styles and the display of query-results. Beer-styles can be selected by means of a series of radio-buttons, within a dialog named availableBeerStylesDialog.

A value is associated with each possible radio-button selection. When the user left-clicks on the Query Elasticsearch button, this value is retrieved:

$("#queryElasticsearch").click(function(event)
{  
    // Get the user's radio-button selection, which corresponds to a particular
    // beer style.
    // 
    var beerStyles = document.getElementsByName("beerStyle");
    var selectedBeerValue = 0;
    var selectedBeerStyle = "";

    for (var i = 0; i < beerStyles.length; i++) 
    {
        if (beerStyles[i].checked == true) 
        {
            selectedBeerValue = beerStyles[i].value;
        }
    }

The value is then used to determine the style-name, which will be passed to Elasticsearch, and used in a search-procedure.

if (selectedBeerValue == 0)
{
    selectedBeerStyle = "American-Style Pale Ale";
} 
else 
{
    if (selectedBeerValue == 1)
    {
        selectedBeerStyle = "American-Style Brown Ale";
    }
    else
    {
        if (selectedBeerValue == 2)
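
As an alternative to nested conditionals of this kind, the mapping from radio-button value to style-name could be held in a lookup table. The following is a minimal sketch, not the approach taken by the demo: the names BEER_STYLES_BY_VALUE and styleForValue are hypothetical, and only the two mappings visible in the excerpt above are listed.

```javascript
// Only the two mappings visible in the excerpt are listed: the remaining
// styles are elided there, and are not reproduced here.
var BEER_STYLES_BY_VALUE = {
    "0": "American-Style Pale Ale",
    "1": "American-Style Brown Ale"
};

// Returns the style-name for a radio-button value, or "" if it is unknown.
function styleForValue(value) {
    // Radio-button values arrive as strings, so the key is normalized.
    var key = String(value);
    if (Object.prototype.hasOwnProperty.call(BEER_STYLES_BY_VALUE, key)) {
        return BEER_STYLES_BY_VALUE[key];
    }
    return "";
}

console.log(styleForValue("1"));
```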

A corresponding Elasticsearch query is then prepared and executed:

var esNodeJsAddress = "http://localhost:8081/";
var esNodeJsTargetURL = esNodeJsAddress + '?' + "foo=" + selectedBeerStyle;

$.getJSON(esNodeJsTargetURL, function(dataFromElasticsearch)  
{

Retrieved IDs are, first, assembled and displayed:

$.each(dataFromElasticsearch, function(key, val) 
{	
    esDataDisplayString = esDataDisplayString + '<p>' + dataFromElasticsearch[key]._id + '</p>';
    numberOfIds++;
});

document.getElementById('ElasticSearchRetrievalsContent').innerHTML = esDataDisplayString;

Then, the IDs are dispatched as an array, to cbNodeJsQueryAgent, so that Couchbase Server can be searched for each of them.

cbNodeJsTargetURL = cbNodeJsAddress + '?' + keyNameForIDparam + '=' 
+ returnedCouchbaseIDsStringed + '&' + countKey + '=' + numberOfIds;

$.getJSON(cbNodeJsTargetURL, function(dataReturnedFromCouchbase) 
{

An array of documents is returned, each of which is duly displayed:

for (var currentKeyPosition = 0; currentKeyPosition < numberOfIds; currentKeyPosition++)
{
    $.each(JSON.parse(dataReturnedFromCouchbase[currentKeyPosition]), function(key, val)
    {
        // Append one paragraph for each key-value pair of the current document.
        cbDataDisplayString = cbDataDisplayString + '<p>' + "\"" + key + "\""
            + " : " + "\"" + val + "\"" + '</p>';
    });
}

document.getElementById('CouchbaseRetrievalsContent').innerHTML = cbDataDisplayString;

Server-Side node.js for Elasticsearch

The file esNodeJsQueryAgent.js uses the require function to add appropriate modules, including the module for the Elasticsearch client. It then creates an instance of the client:

var http = require('http');
var url = require('url');
var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({
    host: 'localhost:9200',
    log: 'trace'
});

An http server is then created:

http.createServer(function (request, response) 
{
console.log('New connection');

The server is directed (near the end of the file) to listen on port 8081:

}).listen(8081);

The value passed to the program by couchbaseESqueryDemo is retrieved, by referencing its known key:

var queryObject = url.parse(request.url, true).query
		
var luceneString = queryObject.foo;
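
The same value can alternatively be retrieved by means of the URL class that node.js provides as a global. The following is a minimal sketch, in which the function name extractLuceneString is hypothetical; the key foo is the one used by couchbaseESqueryDemo.

```javascript
// Returns the value of the foo parameter from a request URL, or null if
// the parameter is absent.
function extractLuceneString(requestUrl) {
    // request.url holds only a path and query, so a base is required:
    // its value is arbitrary, and is not otherwise used.
    var parsed = new URL(requestUrl, 'http://localhost');
    return parsed.searchParams.get('foo');
}

console.log(extractLuceneString('/?foo=American-Style%20Pale%20Ale'));
```

Percent-encoded characters in the value are decoded during retrieval, so the style-name is returned with its spaces restored.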

Then, an appropriate query-string is created, to be passed to Elasticsearch:

var clientSearchStringStart = 
    "{index: 'beer-sample', body: { query: { query_string: { query: ";
var clientSearchStringEnd = "}}}}"
var clientSearchStringFull = clientSearchStringStart + "\"" + luceneString + "\"" + clientSearchStringEnd;
var extendedLuceneString = "\"" + luceneString + "\"";

Next, the Elasticsearch query is made, on the beer-sample index:

client.search({
    index:'beer-sample',
    body: {
        query: {
            query_string: {
                query: luceneString
            }
        }
     }
 }).then(function(resp){
     hits = resp.hits.hits;
     console.log("Hits are: " + JSON.stringify(hits));

The retrieved data is then passed back to couchbaseESqueryDemo:

response.writeHead(200, {"Content-Type": "application/json", "Access-Control-Allow-Origin": "*"});
response.end(JSON.stringify(hits));

Server-Side node.js for Couchbase

Having used the require function to include modules for url and http, cbNodeJsQueryAgent creates an http server, and specifies that it will listen on port 8080.

It then parses the query-URL, and obtains the ID-array, provided by couchbaseESqueryDemo. It then prepares to access Couchbase on its default port:

var queryObject = url.parse(request.url, true).query

var couchbase = require("couchbase");
var myCluster = new couchbase.Cluster('couchbase://localhost');

// Authentication, required by 5.0 and later...
//
myCluster.authenticate('beer-sample', 'beer-sample');

var myBucket = myCluster.openBucket('beer-sample');

Note, in the code-fragment immediately above, the line used for authentication, which is required by Couchbase Server 5.0 and later. This particular example assumes that a user has been defined whose username and password are both beer-sample. It also assumes that the user has been assigned the Full Bucket Access role for the beer-sample bucket. If a pre-5.0 version of Couchbase Server is being used, this line can be commented out or deleted. See Authorization, for information on accessing server-resources by means of Role-Based Access Control.

Next, the function searchCouchbaseForNextID is called recursively, once for each ID in the array. The function itself invokes the Couchbase SDK get method, to search Couchbase Server for a single ID. (Note that this method is asynchronous.) Once all IDs have been searched for, a response containing an array of retrieved documents is provided to the client.

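function searchCouchbaseForNextID(arrayOfIDs, count, totalCount)
{
    // The get method is asynchronous: the callback runs when the lookup completes.
    myBucket.get(arrayOfIDs[count], function(err, res)
    {
        // Store null for a failed lookup, rather than dereferencing a missing result.
        couchbaseObjectArray[count] = err ? null : JSON.stringify(res.value);

        if (count < totalCount)
        {
            // More IDs remain: search for the next one, in the same array.
            count++;
            searchCouchbaseForNextID(arrayOfIDs, count, totalCount);
        }
        else
        {
            // All IDs have been searched for: return the documents to the client.
            response.writeHead(200, {"Content-Type": "application/json",
                "Access-Control-Allow-Origin": "*"});
            response.end(JSON.stringify(couchbaseObjectArray));
        }
    });
}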
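
The pattern of sequential, callback-driven retrieval can be tried out without a running server. The following is a minimal sketch, not the demo's own code: the names fetchAllSequentially and fakeBucket are hypothetical, and fakeBucket is an in-memory stand-in that offers only a get method with the callback signature used above.

```javascript
// Fetches each ID in turn through bucket.get(id, callback), and then
// passes the array of results to done.
function fetchAllSequentially(bucket, ids, done) {
    var results = [];

    function next(position) {
        if (position >= ids.length) {
            done(results);
            return;
        }
        bucket.get(ids[position], function (err, res) {
            // Record null for a failed lookup, so positions still match the IDs.
            results[position] = err ? null : JSON.stringify(res.value);
            next(position + 1);
        });
    }

    next(0);
}

// Hypothetical in-memory stand-in for a bucket, used only to exercise the sketch.
var fakeBucket = {
    docs: { 'beer-1': { name: 'One' } },
    get: function (id, callback) {
        if (Object.prototype.hasOwnProperty.call(this.docs, id)) {
            callback(null, { value: this.docs[id] });
        } else {
            callback(new Error('not found: ' + id), null);
        }
    }
};

fetchAllSequentially(fakeBucket, ['beer-1', 'missing'], function (results) {
    console.log(results);
});
```

Because the terminating test is made against the length of the array, no separate count needs to be passed; and an empty array results in an immediate call to done.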

Setting Up the Example

Successful running of the example requires that Couchbase Server and Elasticsearch both be already installed, configured, and running. The instructions in this section assume that all are on the same node, and that all services can thus be accessed from localhost.

Note that the node.js program esNodeJsQueryAgent.js has been written to run on port 8081; and cbNodeJsQueryAgent.js on port 8080. If you wish to change these port-designations, you must edit the program-files, including that for couchbaseESqueryDemo.html.

To run the provided node.js programs, you must install both the Couchbase SDK and node.js Elasticsearch client; which in turn requires that you install and use the Node Package Manager, npm. On non-Windows platforms, you may also need to install node-gyp.

For information on installing the node.js instance of the Couchbase SDK see the documentation at Start Using the SDK. See also the Elasticsearch documentation for installing the Elasticsearch client, at About.

When you have installed the required SDK, start cbNodeJsQueryAgent at the command-line, as follows:

$ node cbNodeJsQueryAgent.js

The message Server started is provided in response.

Start esNodeJsQueryAgent in a separate terminal, as follows:

$ node esNodeJsQueryAgent.js

A response is provided, confirming that the program has been added as a connection to http://localhost:9200, which is the Elasticsearch port.

Now, bring up couchbaseESqueryDemo.html in a browser. The layout appears as follows:

The UI features three principal elements. At the upper-left, a dialog presents a series of radio-buttons, permitting selection from a number of beer-styles. At the lower-left, a pane is provided for the display of IDs retrieved from Elasticsearch; at the right, a pane for the display of documents retrieved from Couchbase.

Each beer-style can be selected by its corresponding radio-button. For example:

Once a beer-style has been selected, querying can be initiated by left-clicking on the Query Elasticsearch button.

The full query-routine is duly performed: the beer-style is passed to Elasticsearch, and queried on; ID-information is returned to the client; then, ID-information is queried against Couchbase Server. Elements retrieved from both repositories are displayed in the appropriate panes:

The panes can be scrolled as needed, to reveal the full set of results.

Using Advanced Lucene Strings

In the above example, since a single beer-style was used as the basis for an Elasticsearch query, the Lucene string submitted was simply the style-name. Note that (as in the case of the curl command-line example provided earlier) more complex queries can be submitted, with the full syntactical form of the query simply submitted as the string. For example, the string American-Style Pale Ale might be replaced by American-Style Pale Ale AND American-Style Brown Ale.
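
A string of this kind can be assembled from a list of style-names. The following is a minimal sketch, in which the function name combineStyles is hypothetical: it joins the names with the chosen Lucene operator, producing the string given in the example above.

```javascript
// Joins style-names into a single Lucene string, using the given operator.
function combineStyles(styles, operator) {
    return styles.join(' ' + operator + ' ');
}

console.log(combineStyles(
    ['American-Style Pale Ale', 'American-Style Brown Ale'], 'AND'));
```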