Data Modeling Basics

Data Modeling Basics

Couchbase Server is a multi-model database that natively manipulates data in key-value form or in JSON documents. Unlike relational databases which require strict predefined schema with tables and columns, Couchbase Server requires no predefined schema. Each JSON document represents its own schema and can specify its own set of attributes per document.

With Couchbase Server, applications can define their schema at the application layer and freely evolve their schema with no added administrative action. Schema changes do not require scheduling downtime nor do they trigger long running, offline updates to existing data.

Couchbase Server supports both document and key-value data natively. Document-based data representation allows complex objects to be natively represented with full fidelity in Couchbase Server. Key-value representation can be mixed with document-centric design and is typically used to optimize access for extreme low latencies.

Relational Data Modeling vs Document Oriented Data Modeling

In a relational database system you must define a schema before adding records to a database. The schema is the structure described in a formal language supported by the database and provides a blueprint for the tables in a database and the relationships between tables of data. Within a table, you need to define constraints in terms of rows and named columns as well as the type of data that can be stored in each column.

In contrast, a document-oriented database contains documents, which are records that describe the "schema" of the data in the document, as well as the actual data. Documents can be as complex as you choose; you can use nested data to provide additional sub-categories of information about your object. You can also use one or more documents to represent a real-world object. The following example compares a conventional table with document-based objects.

Figure 1. 1-to-many relationship comparison between relational table and JSON documents

In this example we have a table that represents "hotels" and their respective attributes: name, address, city, and so forth. The relational model conforms to a schema with a specified number of fields which represent a specific purpose and data type. There is also a one-to-many relationship that is normalized to the "hotel reviews" table which also contains its own attributes from date of the review to the ratings. Normalization is common practice in the relational model that requires you to decompose data into smaller related tables

The equivalent document-based model has an individual document per "hotel". Each document contains the information for a specific "hotel" and embeds the "hotel reviews" in the document as an array.

In a document-oriented model, data objects are stored as documents; each document stores your data and enables you to update the data or delete it. Instead of columns with names and data types, you describe the data in the document, and provide the value for that description. If you want to add attributes to a "hotels" or "hotel reviews" table in a relational model, you need to modify the database schema to include the additional columns and their data types. In the case of document-based data, you just add the additional key-value pairs representing the new fields to your document.

Using JSON Documents

JavaScript Object Notation (JSON) is a lightweight data-interchange format which is easy to read and change. JSON is language-independent although it uses similar constructs to JavaScript. JSON documents enable you to benefit from all new Couchbase features, such as indexing and querying; they also to provide a logical structure for more complex data and enable you to provide logical connections between different records.

JSON supports the following basic data types:
  • Numbers, including integer and floating point
  • Strings, including all unicode characters and backslash escape characters
  • Boolean: true or false
  • Arrays, enclosed in square brackets: ["one", "two", "three"]
  • Objects, consisting of key-value pairs, and also known as an associative array or hash. The key must be a string and the value can be any supported JSON data type.

For more information on creating valid JSON documents, see http://www.json.org.

When you use JSON documents to represent your application data, you should think about the document as a logical container for information. This involves thinking about how data from your application fits into natural groups. It also requires thinking about the information you want to manage in your application. Data modeling for Couchbase Server is similar to the process that you would follow for traditional relational databases; however, there is much more flexibility and you can change your data structures later on.

A sample entity may be a "hotel" you represent for a travel web site that provides information on many hotels. The "hotel" entity may contain many attributes such as name and address, and may contain embedded structured like reviews that can represent individual reviews users have submitted for a given hotel.
{  
   "type":"hotel",
   "address":"Capstone Road, ME7 3JE",
   "city":"Medway",
   "country":"United Kingdom",
   "hotel_name":"Medway Youth Hostel",
   "reviews":[  
      {  
         "author":"author::Ozella_Sipes",
         "content":"This was our 2nd trip here and we enjoyed it as much or more than last year. ",
         "date":"2013-06-22 18:33:50 +0300",
         "ratings":{  
            "Cleanliness":5,
            "Location":4,
            "Overall":4,
            "Rooms":3,
            "Service":5,
            "Value":4
         }
      },
      {  
         "author_id":"author::Barton_Marks",
         "content":"We found the hotel de la Monnaie through Interval and we thought we'd give it a try while we attended a conference in New Orleans. This place was a perfect location and it definitely beat staying downtown at the Hilton with the rest of the attendees",
         "date":"2015-03-02 19:56:13 +0300",
         "ratings":{  
            "Business service (e.g., internet access)":4,
            "Check in / front desk":4,
            "Cleanliness":4,
            "Location":4,
            "Overall":4,
            "Rooms":3,
            "Service":3,
            "Value":5
         }
      }
   ]
},

In traditional relational database modeling, you would create tables that contain a subset of information for an entity. Just like the example above, a hotel may contain many reviews which are stored in a separate table and referenced by the hotel_id. In case of JSON documents, you use key-values pairs, or even nested key-value pairs and don’t have to break "hotels" and "hotel reviews" apart.

Given the boundary-less nature of the document model, there are several considerations to have in mind when you design your JSON document:
  • Whether you want to use a type field at the highest level of your JSON document in order to group and filter object types, for instance the above example includes a "type" attribute that has the value "hotel".
  • What particular keys, ids, prefixes or conventions do you want to use for items, for instance ‘Hotel::Medway_Youth_Hostel.’
  • If want to use a document to access other documents. In other words, you can store keys that refer other documents in a JSON document and get the keys through this document. For instance, the above example includes author_id under reviews that refer to the detailed information on authors.