Couchbase Server neither enforces nor validates for any particular document structure. Below are the design choices that impact JSON document design.
The document ID is the primary identifier of a document in the database. Multiple data sets are expected to share a common bucket in Couchbase. To ensure each data set has an isolated keyspace, it is a best practice to include a type/class/use-case/sub-domain prefix in all document keys. As an example of a User Model, you might have a property called "userId": 123
, the document key might look like user:123
, user_123
, or user::123
. Every Document ID is a combination of two or more parts/values, that should be delimited by a character such as a colon or an underscore. Pick a delimiter, and be consistent throughout your enterprise.
Just as each Document ID should contain a prefix of the type/model, it is also a best practice to include that same value in the body of the document. This allows for efficient filtering by document type at query time or filtered XDCR replications. This property can be named many different names: type
, docType
, _type
, and _class
are all common choices, choose one that fits your organization's standards.
{
"_type": "user",
"userId": 123
At a minimum, every JSON document should contain a type and version property. Depending on your application requirements, use case, the line of business, etc. other common properties to consider at:
_created
- A timestamp of when the document was created in epoch time (milliseconds or seconds if millisecond precision is not required)_createdBy
- A user ID/name of the person or application that created the document_modified
- A timestamp of when the document was last modified in epoch time (milliseconds or seconds if millisecond precision is not required)_modifiedBy
- A user ID/name of the person or application that modified the document_accessed
- A timestamp of when the document was last accessed in epoch time (milliseconds or seconds if millisecond precision is not required)_geo
- A 2 character ISO code of a countryThe use of a leading _
creates a standardized approach to global attributes across all documents within the enterprise.
{
"_type": "user",
"_schema": "1.2",
"_created": 1544734688923
"userId": 123
}
The same can be applied through a top-level property i.e. "meta": {}
.
{
"meta": {
"type": "user",
"schema": "1.2",
"created": 1544734688923
},
"userId": 123
}
Choose an approach that works within your organization and be consistent throughout your applications.
{% hint style="info" %}
Note: There is not a right or wrong property name, however, if you're application will leverage Couchbase Mobile (in particular Sync-Gateway), the use of a leading underscore should be avoided, as any document that contains root level properties with a leading underscore will fail to replicate. This is not a bug, and it meant to facilitate backward compatibility with v1.0 of the replication protocol.
{% endhint %}
geoCode vs countryCode
userName vs usyslogintxt
firstName or first_name or firstname
, but pick one."phones": [ ... ], "address": { ... }, "genre": " ... ", "scale": 2.3
.user, bucket, cluster, role, select, insert
etc., Please refer N1QL Reserved Word for more details on how to escape reserved words in N1QL.first_name vs first-name
.The query model changes based on the choice of having a single root attribute or the type
attribute embedded. Lets take a look at the track
document as an example.
Root attribute is a single, top-level attribute with all other attributes encapsulated as an object value of the root attribute. In the below example, the root element of the JSON document is track
.
{
"track": {
"artist": "Paul Lekakis",
"created": "2015-08-18T19:57:07",
"genre": "Hi-NRG",
"id": "3305311F4A0FAAFEABD001D324906748B18FB24A",
"mp3": "https://goo.gl/KgKoR7",
"ver": "1.0",
"ratings": [
{
"created": "2015-08-20T12:24:44",
"rating": 4,
"username": "sublimatingraga37014"
},
{
"created": "2015-08-21T09:23:57",
"rating": 4,
"username": "untillableshowings34122"
}
],
"title": "My House",
"modified": "2015-08-18T19:57:07"
}
}
In this example, the JSON document is in a flat structure but there is an attribute called type
embedded within the document.
{
"artist": "Paul Lekakis",
"created": "2015-08-18T19:57:07",
"genre": "Hi-NRG",
"id": "3305311F4A0FAAFEABD001D324906748B18FB24A",
"mp3": "https://goo.gl/KgKoR7",
"ver": "1.0",
"ratings": [
{
"created": "2015-08-20T12:24:44",
"rating": 4,
"username": "sublimatingraga37014"
},
{
"created": "2015-08-21T09:23:57",
"rating": 4,
"username": "untillableshowings34122"
}
],
"title": "My House",
"modified": "2015-08-18T19:57:07",
"_type": "track"
}
This is the recommended approach since we can use the type
field to create index.
CREATE INDEX cb2_type ON couchmusic2(_type);
SELECT COUNT(*) AS count
FROM couchmusic2
WHERE _type = "track"
GROUP BY genre;
There are two different ways to represent objects.
phones
is an object in the userProfile
class.{
"type": "userProfile",
"created": "2015-01-28T13:50:56",
"dateOfBirth": "1986-06-09",
"email": "andy.bowman@games.com",
"firstName": "Andy",
"gender": "male",
"lastName": "Bowman",
"phones": {
"number": "212-771-1834",
"type": "cell"
},
"pwd": "636f6c6f7261646f",
"status": "active",
"title": "Mr",
"updated": "2015-08-25T10:29:16",
"username": "copilotmarks61569"
}
phones
is an array of objects in the userProfile
class.{
"type": "userProfile",
"created": "2015-01-28T13:50:56",
"dateOfBirth": "1986-06-09",
"email": "andy.bowman@games.com",
"firstName": "Andy",
"gender": "male",
"lastName": "Bowman",
"phones": [
{
"number": "212-771-1834",
"type": "cell"
}
],
"pwd": "636f6c6f7261646f",
"status": "active",
"title": "Mr",
"updated": "2015-08-25T10:29:16",
"username": "copilotmarks61569"
}
Array values may be simple or object.
{
"created": "2014-12-04T03:36:18",
"id": "003c6f65-641a-4c9a-8e5e-41c947086cae",
"name": "Eclectic Summer Mix",
"owner": "copilotmarks61569",
"type": "playlist",
"tracks": [
"9FFAF88C1C3550245A19CE3BD91D3DC0BE616778",
"3305311F4A0FAAFEABD001D324906748B18FB24A",
"0EB4939F29669774A19B276E60F0E7B47E7EAF58"
],
"updated": "2015-09-11T10:39:40"
}
{
"created": "2014-12-04T03:36:18",
"id": "003c6f65-641a-4c9a-8e5e-41c947086cae",
"name": "Eclectic Summer Mix",
"owner": "copilotmarks61569",
"type": "playlist",
"tracks": [
{
"id": "9FFAF88C1C3550245A19CE3BD91D3DC0BE616778",
"title": "Buddha Nature",
"artist": "Deuter",
"genre": "Experimental Electronic"
},
{
"id": "3305311F4A0FAAFEABD001D324906748B18FB24A",
"title": "Bluebird Canyon Stomp",
"artist": "Beaver & Krause",
"genre": "Experimental Electronic"
}
],
"updated": "2015-09-11T10:39:40"
}
Working with Timestamp format is the difficult thing when it comes to JSON, since JSON does not have a standardized date format. Dates are commonly stored as string in JSON.
The following are examples of commonly used date formats.
{
"countryCode": "US",
"type": "country",
"gdp": 53548,
"name": "United States of America",
"region": "Americas",
"region-number": 21,
"sub-region": "Northern America",
"updated": "2010-07-15T15:34:27"
}
{
"countryCode": "US",
"type": "country",
"gdp": 53548,
"name": "United States of America",
"region": "Americas",
"region-number": 21,
"sub-region": "Northern America",
"updated": [ 2010, 7, 15, 15, 34, 27 ]
}
{
"countryCode": "US",
"type": "country",
"gdp": 53548,
"name": "United States of America",
"region": "Americas",
"region-number": 21,
"sub-region": "Northern America",
"updated": 1279208067000
}
It is important to understand that JSON supports optional properties. If a property has a null value, consider dropping it from the JSON unless there's a good reason not to. N1QL makes it easy to test for missing or null property values. Be sure your application code handles the case where a property value is missing.
Fields may have a value
SELECT geocode WHERE geocode IS VALUED
{
"geocode": "USA"
}
Fields may have no value
SELECT geocode WHERE geocode IS NOT VALUED
{
"geocode": ""
}
Fields may be missing
SELECT geocode WHERE geocode IS [NOT] MISSING
{
}
Fields may be explicitly null
SELECT geocode WHERE geocode IS [NOT] NULL
{
"geocode": null
}