Sync Function API

Sync Function API

Edit on GitHub

The sync function is the core API you interact with on Sync Gateway. This article explains its functionality, and how you write and configure it.

For simple applications it might be the only server-side code you need to write. For more complex applications it is still a primary touchpoint for managing data routing and access control.

The sync function is a JavaScript function whose source code is stored in the Sync Gateway's database configuration file. The sync function is called every time a new revision/update is made to a document, and the changes to channels and access made by the sync function are tied to that revision. If the document is later updated, the sync function will be called again on the new revision, and the new channel assignments and user/channel access replace the ones from the first call.

It can do the following things:

  • Validate the document: If the document has invalid contents, the sync function can throw an exception to reject it. The document won't be added to the database, and the client request will get an error response
  • Authorize the change: The sync function can call requireUser() or requireRole() to specify what user(s) are allowed to modify the document. If the user making the change isn't in that list, an exception is thrown and the update is rejected with an error. Similarly, requireAccess() requires that the user making the change have access to any of the listed channels.
  • Assign the document to channels: Based on the contents of the document, the sync function can call channel() to add the document to one or more channels. This makes it accessible to users who have access to those channels, and will cause the document to be pulled by users that are subscribed to those channels.
  • Grant users access to channels: Calling access(user, channel) grants a user access to a channel. This allows documents to act as membership lists or access-control lists.

The sync function is crucial to the security of your application. It's in charge of data validation, and of authorizing both read and write access to documents. The API is high-level and lets you do some powerful things very simply, but you do need to remain vigilant and review the function carefully to make sure that it detects threats and prevents all illegal access. The sync function should be a focus of any security review of your application.

You write your sync function in JavaScript. The basic structure of the sync function looks like this:

function (doc, oldDoc) {
    // Your code here
}

The sync function arguments are:

  • doc: An object, the content of the document that is being saved. This matches the JSON that was saved by the Couchbase Lite and replicated to Sync Gateway. The _id property contains the document ID, and the _rev property the new revision ID. If the document is being deleted, there will be a _deleted property with the value true.
  • oldDoc: If the document has been saved before, the revision that is being replaced is available in this argument. Otherwise it's null. (In the case of a document with conflicts, the current provisional winning revision is passed in oldDoc.) Your implementation of the sync function can omit the oldDoc parameter if you do not need it (JavaScript ignores extra parameters passed to a function).

The default Sync Function is documented in the API reference databases.$dbname.sync.

Validation and Authorization

The sync function API provides several methods that you can use to validate document creation, updates and deletions. For error conditions, you can simply call the built-in JavaScript throw() function. You can also enforce user access privileges by calling requireUser(), requireRole(), or requireAccess().

What happens to rejected documents? Firstly, they aren't saved to the Sync Gateway's database, so no access changes take effect. Instead an error code (usually 403 Forbidden) is returned to Couchbase Lite's replicator.

Any other exception (including implicit ones thrown by the JavaScript runtime, like array bounds exceptions) will also prevent the document update, but will cause the gateway to return an HTTP 500 "Internal Error" status.

throw()

At the most basic level, the sync function can prevent a document from persisting or syncing to any other users by calling throw() with an error object containing a forbidden: property. You enforce the validity of document structure by checking the necessary constraints and throwing an exception if they're not met.

Here is an example sync function that disallows all writes to the database it is in.

function(doc) {
   throw({forbidden: "read only!"})
}

The document update will be rejected with an HTTP 403 "Forbidden" error code, with the value of the forbidden: property being the HTTP status message. This is the preferred way to reject an update.

In validating a document, you'll often need to compare the new revision to the old one, to check for illegal changes in state. For example, some properties may be immutable after the document is created, or may be changeable only by certain users, or may only be allowed to change in certain ways. That's why the current document contents are given to the sync function, as the oldDoc parameter.

We recommend that you not create invalid documents in the first place. As much as possible, your app logic and validation function should prevent invalid documents from being created locally. The server-side sync function validation should be seen as a fail-safe and a guard against malicious access.

requireUser(username)

The requireUser() function authorizes a document update by rejecting it unless it's made by a specific user or users, as shown in the following example:

// Throw an error if username is not "snej":
requireUser("snej");

// Throw an error if username is not in the list:
requireUser(["snej", "jchris", "tleyden"]);

The function signals rejection by throwing an exception, so the rest of the sync function will not be run. All properties of the doc parameter should be considered untrusted, since this is after all the object that you're validating. This may sound obvious, but it can be easy to make mistakes, like calling requireUser(doc.owners) instead of requireUser(oldDoc.owners). When using one document property to validate another, look up that property in oldDoc, not doc!

requireRole(rolename)

The requireRole() function authorizes a document update by rejecting it unless the user making it has a specific role or roles, as shown in the following example:

// Throw an error unless the user has the "admin" role:
requireRole("admin");

// Throw an error unless the user has one or more of those roles:
requireRole(["admin", "old-timer"]);

The argument may be a single role name, or an array of role names. In the latter case, the user making the change must have one or more of the given roles.

The function signals rejection by throwing an exception, so the rest of the sync function will not be run.

requireAccess(channels)

The requireAccess() function authorizes a document update by rejecting it unless the user making it has access to at least one of the given channels, as shown in the following example:

// Throw an exception unless the user has access to read the "events" channel:
requireAccess("events");

// Throw an exception unless the user can read one of the channels in the
// previous revision's "channels" property:
if (oldDoc) {
    requireAccess(oldDoc.channels);
}

The function signals rejection by throwing an exception, so the rest of the sync function will not be run.

If a user was granted access to the star channel (noted *), a call to requireAccess('any channel name')' will fail because the user wasn't granted access to that channel (only to the * channel). To allow a user to perform a document update in this case, you can specify multiple channel names (requireAccess('any channel name', '*')')

Routing

The sync function API provides several functions that you can use to route documents. The routing functions assign documents to channels, and enable user access to channels (which will route documents in those channels to those users.)

Routing changes have no effect until the document is actually saved in the database, so if the sync function first calls channel() or access(), but then rejects the update, the channel and access changes will not occur.

channel (name)

The channel() function routes the document to the named channel(s). It accepts one or more arguments, each of which must be a channel name string, or an array of strings. The channel function can be called zero or more times from the sync function, for any document. Here is an example that routes all "published" documents to the "public" channel:

function (doc, oldDoc) {
   if (doc.published) {
      channel ("public");
   }
}

Tip: As a convenience, it is legal to call channel with a null or undefined argument; it simply does nothing. This allows you to do something like channel(doc.channels) without having to first check whether doc.channels exists.

Note: Channels don't have to be predefined. A channel implicitly comes into existence when a document is routed to it.

If the document was previously routed to a channel, but the current call to the sync function (for an updated revision) doesn't route it to that channel, the document is removed from the channel. This may cause users to lose access to that document. If that happens, the next time Couchbase Lite pulls changes from the gateway, it will receive an empty revision of the document with nothing but a "_removed": true property. (Of course the previous revisions of the document remain in your Couchbase Lite database until it's compacted.)

Read Access

access (username, channelname)

The access() function grants access to a channel to a specified user. It can be called multiple times from a sync function.

The first argument can be an array of strings, in which case each user in the array is given access. The second argument can also be an array of strings, in which case the user(s) are given access to each channel in the array. As a convenience, either argument may be null or undefined, in which case nothing happens.

If a user name begins with the prefix role:, the rest of the name is interpreted as a role rather than a user. The call then grants access to the specified channels for all users with that role.

Note: The effects of all access calls by all active documents are effectively unioned together, so if any document grants a user access to a channel, that user has access to the channel.

Caution: Revoking access to a channel can cause a user to lose access to documents, if s/he no longer has access to any channels those documents are in. However, the replicator does not currently delete such documents that have already been synced to a user's device (although future changes to those documents will not be replicated.) This is a design limitation of the Sync Gateway 1.0 that may be resolved in the future.

The following code snippets shows some valid ways to call access():

access ("jchris", "mtv");
access ("jchris", ["mtv", "mtv2", "vh1"]);
access (["snej", "jchris", "role:admin"], "vh1");
access (["snej", "jchris"], ["mtv", "mtv2", "vh1"]);
access (null, "hbo");  // no-op
access ("snej", null);  // no-op

Here is an example of a sync function that grants access to a channel for all the users listed in a document:

function (doc, oldDoc) {
    if (doc.type == "chat_room") {
        // Give members access to the chat channel this document manages:
        access (doc.members, doc.channel_name);

        // Put this document in the channel it manages:
        channel (doc.channel_name);
    }
}

role (username, rolename)

The role() function grants a user a role, indirectly giving them access to all channels granted to that role. It can also affect the user's ability to revise documents, if the access function requires role membership to validate certain types of changes. Its use is similar to access — the value of either parameter can be a string, an array of strings, or null. If the value is null, the call is a no-op.

For consistency with the access call, role names must always be prefixed with role:. An exception is thrown if a role name doesn't match this. Some examples:

role ("jchris", "role:admin");
role ("jchris", ["role:portlandians", "role:portlandians-owners"]);
role (["snej", "jchris", "traun"], "role:mobile");
role ("ed", null);  // no-op

Note: Roles, like users, have to be explicitly created by an administrator. So unlike channels, which come into existence simply by being named, you can't create new roles with a role() call. Nonexistent roles don't cause an error, but have no effect on the user's access privileges. You can create a role after the fact; as soon as a role is created, any pre-existing references to it take effect.

Expiry

expiry (value)

Calling expiry(value) from within the sync function will set the expiry value (TTL) on the document. When the expiry value is reached, the document will be purged from the database.

expiry("2018-07-06T17:00:00+01:00")

Under the hood, the expiration time is set and managed on the Couchbase Server document (TTL is not supported for databases in walrus mode). The value can be specified in two ways:

  • ISO-8601 format: for example the 6th of July 2016 at 17:00 in the BST timezone would be 2016-07-06T17:00:00+01:00;
  • as a numeric Couchbase Server expiry value: Couchbase Server expiries are specified as Unix time, and if the desired TTL is below 30 days then it can also represent an interval in seconds from the current time (for example, a value of 5 will remove the document 5 seconds after it is written to Couchbase Server). The document expiration time is returned in the response of GET /{db}/{doc} when show_exp=true is included in the querystring.

As with the existing explicit purge mechanism, this applies only to the local database; it has nothing to do with replication. This expiration time is not propagated when the document is replicated. The purge of the document does not cause it to be deleted on any other database.

If shared bucket access is enabled (introduced in Sync Gateway 1.5), the behaviour of the expiry feature changes in the following way: when the expiry value is reached, instead of getting purged, the active revision of the document is tombstoned. If there is another non-tombstoned revision for this document (i.e a conflict) it will become the active revision. The tombstoned revision will be purged when the server's metadata purge interval is reached.

Document Conflicts

If a document is in conflict there will be multiple current revisions. The default, "winning" one is the one whose channel assignments and access grants take effect.

Handling deletions

Validation checks often need to treat deletions specially, because a deletion is just a revision with a "_deleted": true property and usually nothing else. Many types of validations won't work on a deletion because of the missing properties — for example, a check for a required property, or a check that a property value doesn't change. You'll need to skip such checks if doc._deleted is true.

Example

Here's an example of a complete, useful sync function that properly validates and authorizes both new and updated documents. The requirements are:

  • Only users with the role editor may create or delete documents.
  • Every document has an immutable creator property containing the name of the user who created it.
  • Only users named in the document's (required, non-empty) writers property may make changes to a document, including deleting it.
  • Every document must also have a title and a channels property.

    function (doc, oldDoc) {
            if (doc._deleted) {
                    // Only editors with write access can delete documents:
                    requireRole("role:editor");
                    requireUser(oldDoc.writers);
                    // Skip other validation because a deletion has no other properties:
                    return;
            }
            // Required properties:
            if (!doc.title || !doc.creator || !doc.channels || !doc.writers) {
                    throw({forbidden: "Missing required properties"});
            } else if (doc.writers.length == 0) {
                    throw({forbidden: "No writers"});
            }
            if (oldDoc == null) {
                    // Only editors can create documents:
                    requireRole("role:editor");
                    // The 'creator' property must match the user creating the document:
                    requireUser(doc.creator)
            } else {
                    // Only users in the existing doc's writers list can change a document:
                    requireUser(oldDoc.writers);
                    // The "creator" property is immutable:
                    if (doc.creator != oldDoc.creator) {
                            throw({forbidden: "Can't change creator"});
                    }
            }
            // Finally, assign the document to the channels in the list:
            channel(doc.channels);
    }
    

Changing the sync function

The Sync Function computes document routing to channels and user access to channels at document write time. If the Sync Function is changed, Sync Gateway needs to reprocess all existing documents in the bucket to recalculate the routing and access assignments.

The Admin REST API has a re-sync endpoint to process every document in the database again. To update the Sync Function, it is recommended to follow the steps outlined below:

  1. Update the configuration file of the Sync Gateway instance.
  2. Restart Sync Gateway.
  3. Take the database offline using the /{db}/_offline endpoint.
  4. Call the re-sync endpoint on the Admin REST API. The message body of the response contains the number of changes that were made as a result of calling re-sync.
  5. Bring the database back online using the /{db}/_online endpoint.

This is an expensive operation because it requires every document in the database to be processed by the new function. The database can't accept any requests until this process is complete (because no user's full access privileges are known until all documents have been scanned). Therefore the Sync Function update will result in application downtime between the call to the /{db}/_offline and /{db}/_online endpoints as mentioned above.

When should you run a re-sync?

When running a re-sync operation, the context in the Sync Function is the admin user. For that reason, calling the requireUser, requireAccess and requireRole methods will always succeed. It is very likely that you are using those functions in production to govern write operations. But in a re-sync operation, all the documents are already written to the database. For that reason, it is recommended to use re-sync for changing the assignment to channels only (i.e. reads). If the modifications to the Sync Function only impact write security (and not routing/access), you won't need to run the re-sync operation.

Similarly, if you wish to change the channel/access rules, but only want those rules to apply to documents written after the change was made, then you don't need to run the re-sync operation.

If you need to ensure access to the database during the update, you can create a read-only backup of the Sync Gateway's bucket beforehand, then run a secondary Sync Gateway on the backup bucket, in read-only mode. After the update is complete, switch to the main Gateway and bucket.

In a clustered environment with multiple Sync Gateway instances sharing the load, all the instances need to share the same configuration, so they all need to be taken offline either by stopping the process or taking them offline using the /{db}/_offline endpoint. After the configuration is updated, one instance should be brought up so it can update the database — if more than one is running at this time, they'll conflict with each other. After the first instance finishes opening the database, the others can be started.