Ongoing monitoring and maintenance
Several statistics are available for monitoring a cluster and diagnosing problems.
To understand how the cluster is working and whether it is working effectively, use the following statistics:
- Memory Used (mem_used) - The current size of memory used. If mem_used reaches the RAM quota, you will get OOM_ERROR. mem_used should stay below ep_mem_high_wat, which is the mark at which data is ejected from RAM.
- Disk Write Queue Size (ep_queue_size) - The amount of data waiting to be written to disk.
- Cache Hits (get_hits) - As a rule of thumb, this should be at least 90% of the total requests.
- Cache Misses (ep_bg_fetched) - Ideally this should be low, and certainly lower than get_hits. High or increasing values indicate that data your application expects to find in memory is not there.
- No document available (get_misses) - Requests for a document that Couchbase Server does not have.
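The 90% rule of thumb for cache hits can be checked with a few lines of arithmetic. The following is a minimal sketch, not an official formula: it assumes that the total number of requests is get_hits + get_misses, and the function names hit_ratio and meets_rule_of_thumb are hypothetical helpers introduced only for illustration.

```python
def hit_ratio(get_hits, get_misses):
    """Return the fraction of get requests that found a document.

    Assumption: total requests = get_hits + get_misses.
    With no requests at all, the ratio is reported as 1.0.
    """
    total = get_hits + get_misses
    if total == 0:
        return 1.0
    return get_hits / total


def meets_rule_of_thumb(get_hits, get_misses, threshold=0.90):
    """Return True if the hit ratio is at least the threshold (90% by default)."""
    return hit_ratio(get_hits, get_misses) >= threshold


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    print(hit_ratio(950, 50))
    print(meets_rule_of_thumb(800, 200))
```

A ratio that stays below the threshold over time, rather than a single low reading, is the more useful signal to act on.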
Another key indicator of cluster performance is the water mark, which determines when it is necessary to start freeing up memory. Two important statistics relate to water marks:
- High Water Mark (ep_mem_high_wat) - The system starts ejecting data from memory when this water mark is reached. Ejected values must be fetched from disk when accessed, before they can be returned to the client.
- Low Water Mark (ep_mem_low_wat) - When the low water mark threshold is reached, memory usage is moving toward a critical point, and system administration action must be taken before the high water mark is reached.
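The relationship between mem_used and the two water marks can be expressed as a simple three-way check. This is a sketch under the descriptions given above; the function name memory_state and the state labels are hypothetical and are not part of any Couchbase tool.

```python
def memory_state(mem_used, ep_mem_low_wat, ep_mem_high_wat):
    """Classify memory usage relative to the two water marks.

    Returns one of three illustrative labels:
      "ejecting"  - the high water mark is reached; data is ejected from RAM
      "above_low" - the low water mark is reached; action should be taken
      "ok"        - usage is below both water marks
    """
    if mem_used >= ep_mem_high_wat:
        return "ejecting"
    if mem_used >= ep_mem_low_wat:
        return "above_low"
    return "ok"


if __name__ == "__main__":
    # Made-up byte counts, for illustration only.
    print(memory_state(500, 600, 750))
    print(memory_state(650, 600, 750))
    print(memory_state(800, 600, 750))
```

All three arguments must be in the same unit, since the check only compares them.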
Use cbstats to retrieve these statistics:
shell> cbstats IP:11210 all | \
    egrep "todo|ep_queue_size|_eject|mem|max_data|hits|misses"
The command returns the following statistics:
ep_flusher_todo:
ep_max_data_size:
ep_mem_high_wat:
ep_mem_low_wat:
ep_num_eject_failures:
ep_num_value_ejects:
ep_queue_size:
mem_used:
ep_bg_fetched:
get_hits:
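For scripted monitoring, the filtered output can be read into a dictionary. The sketch below assumes each output line has the form "name: value" with an integer value, as the list above suggests; the exact layout of cbstats output may differ between versions. The function name parse_stats is hypothetical and the sample values are made up.

```python
def parse_stats(text):
    """Parse "name: value" lines into a dict of integer statistics.

    Lines without a colon, or whose value is not an integer,
    are skipped rather than raising an error.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats


# Made-up sample standing in for real command output.
SAMPLE = """\
 ep_queue_size:      12
 mem_used:           1048576
 ep_mem_high_wat:    2097152
 not a stat line
"""

if __name__ == "__main__":
    stats = parse_stats(SAMPLE)
    # Compare memory usage against the high water mark.
    print(stats["mem_used"] < stats["ep_mem_high_wat"])
```

In practice the text would come from capturing the output of the command shown earlier, for example by redirecting it to a file and reading that file.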