About document expiration

Time to live (TTL) is the amount of time until a document expires in Couchbase Server. By default, all documents have a TTL of zero, which indicates the document is kept indefinitely. Typically when you add, set, or replace information, you establish a custom TTL by passing it as a parameter to your method call. As part of normal maintenance operations, Couchbase Server periodically removes all items with expiration times that have passed.

Depending on how long you want the document to live, you provide a TTL value either as a relative number of seconds into the future or as an absolute Unix time. Unix time represents a specific date and time expressed as the number of seconds that have elapsed since Thursday, 1 January 1970 at 00:00:00 Coordinated Universal Time (UTC). For example, the value 1421454149 represents Saturday, 17 January 2015 at 00:22:29 UTC.
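
As a quick check of the Unix time example, the following sketch converts between Unix time and a calendar date. It uses only the Python standard library and is independent of any Couchbase SDK; the variable names are illustrative.

```python
from datetime import datetime, timezone

# Convert the Unix time from the example into a calendar date in UTC.
expiry = datetime.fromtimestamp(1421454149, tz=timezone.utc)
formatted = expiry.strftime("%A, %d %B %Y at %H:%M:%S UTC")
print(formatted)

# Convert a calendar date in UTC back into Unix time.
unix_time = int(datetime(2015, 1, 17, 0, 22, 29, tzinfo=timezone.utc).timestamp())
print(unix_time)
```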

If you specify a TTL as a relative number of seconds into the future, Couchbase Server still stores it as an absolute Unix timestamp. This means, for example, that if you store an item with a two-day relative TTL, immediately make a backup, and then restore from that backup three days later, the expiration time will already have passed and the item will no longer be available.
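
The arithmetic behind the backup scenario can be sketched as follows. The write time is a hypothetical value chosen for illustration; only the relationship between the numbers matters.

```python
DAY = 24 * 60 * 60  # seconds in one day

stored_at = 1421454149          # hypothetical Unix time at which the item is written
relative_ttl = 2 * DAY          # the two-day TTL passed by the client

# The server keeps an absolute timestamp, not the relative value.
absolute_expiry = stored_at + relative_ttl

# The backup is restored three days after the item was written.
restored_at = stored_at + 3 * DAY

# The stored timestamp is already in the past at restore time.
is_expired_after_restore = restored_at >= absolute_expiry
print(absolute_expiry, is_expired_after_restore)
```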

Important: Some of the SDKs, most notably the Java and .NET SDKs, provide convenience methods to set the TTL value. These convenience methods handle the translation logic for TTL values under and over 30 days. This topic describes how TTL values are handled internally by Couchbase Server. Refer to the developer guide and API reference for your particular SDK to learn how to set the TTL. Where available, always use the methods provided by the SDK you are using. They take precedence over the details provided below and are much easier to use.

Here is how to specify a TTL:

  • To set a value of 30 days or less: If you want an item to live for 30 days or less, you can provide a TTL as a relative number of seconds or as Unix time. The maximum value you can specify as a relative number of seconds is the number of seconds in 30 days, namely 30 x 24 x 60 x 60 = 2,592,000. Couchbase Server removes the item the given number of seconds after it stores the item.
  • To set a value over 30 days: If you want an item to live for more than 30 days, you must provide a TTL in Unix time. Any value larger than 2,592,000 is interpreted as an absolute Unix time rather than as a number of seconds.
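
The two rules above can be sketched as a small helper. This is a minimal illustration, not an SDK API: the function names `to_ttl_argument` and `server_expiry_time` are hypothetical, and where your SDK provides its own TTL methods, those should be used instead.

```python
import time

THIRTY_DAYS = 30 * 24 * 60 * 60  # 2,592,000 seconds

def to_ttl_argument(lifetime_seconds, now=None):
    """Return the TTL value to send for a desired lifetime (hypothetical helper)."""
    if lifetime_seconds < 0:
        raise ValueError("lifetime must not be negative")
    if lifetime_seconds <= THIRTY_DAYS:
        # 0 means "keep indefinitely"; other values are relative seconds.
        return lifetime_seconds
    # Longer lifetimes must be sent as an absolute Unix time.
    now = int(time.time()) if now is None else now
    return now + lifetime_seconds

def server_expiry_time(ttl, stored_at):
    """Model how a TTL value is read under the rules above (hypothetical helper)."""
    if ttl == 0:
        return None                # no expiration
    if ttl <= THIRTY_DAYS:
        return stored_at + ttl     # relative number of seconds
    return ttl                     # already an absolute Unix time

# Usage with a hypothetical "current time".
now = 1421454149
one_hour = to_ttl_argument(3600, now=now)
forty_five_days = to_ttl_argument(45 * 24 * 60 * 60, now=now)
print(one_hour, forty_five_days)
```

Passing `now` explicitly keeps the helper deterministic, which makes it easier to test than a version that always reads the system clock.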

Couchbase Server performs lazy expiration: expired items are flagged as deleted rather than being erased immediately. A maintenance process called the expiry pager periodically scans all items and erases those that have expired. By default, this process runs every 60 minutes, but you can configure it to run at a different interval. In addition, if an expired item is requested before the expiry pager has erased it, Couchbase Server removes the item at that point and responds to the requesting process with a message indicating that the item does not exist.
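
The following toy model illustrates the idea of lazy expiration described above. It is a simplified in-memory sketch for explanation only and does not reflect how Couchbase Server is implemented; the class `LazyExpiryStore` and its methods are hypothetical.

```python
class LazyExpiryStore:
    """Toy key-value store with lazy expiration (illustrative only)."""

    def __init__(self):
        # key -> (value, absolute expiry as Unix time; 0 means never expire)
        self._items = {}

    def set(self, key, value, expiry=0):
        self._items[key] = (value, expiry)

    @staticmethod
    def _is_expired(expiry, now):
        return expiry != 0 and now >= expiry

    def get(self, key, now):
        """Return the value, or None if the key is missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expiry = entry
        if self._is_expired(expiry, now):
            del self._items[key]   # removed on access, before the pager runs
            return None
        return value

    def run_expiry_pager(self, now):
        """Periodic sweep: erase every expired item and return how many were erased."""
        expired = [k for k, (_, e) in self._items.items() if self._is_expired(e, now)]
        for key in expired:
            del self._items[key]
        return len(expired)

    def __len__(self):
        return len(self._items)


store = LazyExpiryStore()
store.set("session", "abc", expiry=100)
store.set("token", "t", expiry=100)
store.set("profile", "xyz")            # no expiry

count_before = len(store)               # expired items still occupy space
session = store.get("session", now=150) # access removes the expired item
count_after_get = len(store)
swept = store.run_expiry_pager(now=150) # the sweep removes the rest
count_after_sweep = len(store)
```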

Couchbase Server offers functionality you can use to index and find documents and perform calculations on data, known as views. For views, you write functions in JavaScript that specify what data should be included in an index. Retrieving information using views is called querying a view, and the information Couchbase Server returns is called a result set.

The result set from a view will contain any items stored on disk that meet the requirements of your view function. Therefore, information that has not yet been removed from disk might appear as part of a result set when you query a view.

Using Couchbase views, you can also apply reduce functions, which perform calculations or other aggregations on data. For instance, if you want to count the instances of a type of object, you would use a reduce function. Once again, if an item is on disk, it will be included in any calculation performed by your reduce functions. Because of this behavior, follow these guidelines when handling expiration with views:

  • Detecting Expired Documents in Result Sets: If you are using views to index items from Couchbase Server, items that have expired but have not yet been removed by the expiry pager maintenance process will be part of a result set returned by querying the view. To detect these items so that you can exclude them, set the query parameter include_docs to true. This parameter includes the JSON documents associated with the keys in a result set. For example, if you use the parameter include_docs=true, Couchbase Server will return a result set with an additional doc object that contains the JSON or binary data for that key:

    {
       "total_rows":2,
       "rows":[
          {
             "id":"test",
             "key":"test",
             "value":null,
             "doc":{
                "meta":{
                   "id":"test",
                   "rev":"4-0000003f04e86b040000000000000000",
                   "expiration":0,
                   "flags":0
                },
                "json":{
                   "testkey":"testvalue"
                }
             }
          },
          {
             "id":"test2",
             "key":"test2",
             "value":null,
             "doc":{
                "meta":{
                   "id":"test2",
                   "rev":"3-0000004134bd596f50bce37d00000000",
                   "expiration":1354556285,
                   "flags":0
                },
                "json":{
                   "testkey":"testvalue"
                }
             }
          }
       ]
    }
    

    For expired documents, if you set include_docs=true, Couchbase Server returns a result set indicating that the document no longer exists. Specifically, a key that has expired but has not yet been removed by the cleanup process will appear in the result set as a row where "doc":null:

    {
       "total_rows":2,
       "rows":[
          {
             "id":"test",
             "key":"test",
             "value":null,
             "doc":{
                "meta":{
                   "id":"test",
                   "rev":"4-0000003f04e86b040000000000000000",
                   "expiration":0,
                   "flags":0
                },
                "json":{
                   "testkey":"testvalue"
                }
             }
          },
          {
             "id":"test2",
             "key":"test2",
             "value":null,
             "doc":null
          }
       ]
    }
    
  • Reduces and Expired Documents: In some cases, you might want to use a reduce function to perform aggregations and calculations on data in Couchbase Server. In this case, Couchbase Server takes precalculated values that are stored for an index and derives a final result. This also means that any expired items still on disk will be part of the reduction. This may not be an issue for your final result if the proportion of expired items is low compared to other items. For instance, if you have 10 expired scores still on disk for an average performed over 1 million players, there might be only a minimal difference in the final result. However, if you have 10 expired scores on disk for an average performed over 20 players, you would get a very different result from the average you expect.

    In this case, you might want to run the expiry pager process more frequently to ensure that items that have expired are not included in calculations used in the reduce function. We recommend an interval of 10 minutes for the expiry pager on each node of a cluster. Note that this shorter interval has a slight impact on node performance because cleanup runs more frequently on the node.
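
Both guidelines can be illustrated with a short client-side sketch. The first part filters out rows where doc is null from a view response shaped like the second result set above (trimmed for brevity). The second part uses made-up scores to show how much 10 expired items can skew an average over a small and a large population. All names and score values are hypothetical.

```python
import json

# A trimmed view response queried with include_docs=true.
response = json.loads("""
{
   "total_rows": 2,
   "rows": [
      {"id": "test", "key": "test", "value": null,
       "doc": {"meta": {"id": "test", "expiration": 0, "flags": 0},
               "json": {"testkey": "testvalue"}}},
      {"id": "test2", "key": "test2", "value": null, "doc": null}
   ]
}
""")

# Keep only rows whose document still exists; "doc": null marks an expired item.
live_rows = [row for row in response["rows"] if row.get("doc") is not None]
live_ids = [row["id"] for row in live_rows]
print(live_ids)

def average(values):
    return sum(values) / len(values)

# Hypothetical scores: every live player scored 50, every expired entry scored 0.
expired_scores = [0] * 10
small_average = average([50] * 20 + expired_scores)
large_average = average([50] * 1_000_000 + expired_scores)
print(round(small_average, 2), round(large_average, 4))
```

With 20 live players the stale entries pull the average from 50 down to roughly 33, while with 1 million live players the difference is negligible.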

For more details about setting the interval for this maintenance process, see the cbepctl command line tool documentation, which describes how to specify disk cleanup intervals, and refer to the examples for exp_pager_stime.

For more information about views and view query parameters, see Finding data with views.