casstat

By Denis Sheahan, Netflix Inc.

Introduction

Casstat is a command line tool that monitors various aspects of Cassandra performance. It runs on the Cassandra node itself and is modeled on classic Unix tools such as vmstat and mpstat. It is designed to be always on, running in the background and redirecting its output to a file. It can tag each line of output with an epoch or a timestamp so the data can be correlated with an event.

Note that casstat was designed to run on EC2 instances in the cloud.

Design

Casstat is a bash script that calls other tools at regular intervals, compares the data with its previous sample and normalizes it to a per-second basis. The tools it uses are:

nodetool cfstats to get Column Family performance data
nodetool tpstats to get internal state changes
nodetool cfhistograms to get 95th and 99th percentile response times
nodetool compactionstats to get details on number and type of compactions
iostat to get disk and cpu performance data
ifconfig to calculate network bandwidth
The cassandra log for interesting events

casstat assumes these tools are in the PATH
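
The per-second normalization described above can be illustrated with a minimal sketch. This is not taken from the casstat source; the function name per_second and the counter values are hypothetical, and it assumes the underlying tool reports a cumulative counter that only grows between samples.

```shell
# per_second PREV CUR INTERVAL
# Hypothetical helper: prints the change in a cumulative counter
# per second, truncated to an integer.
per_second() {
    awk -v prev="$1" -v cur="$2" -v secs="$3" \
        'BEGIN { printf "%d\n", (cur - prev) / secs }'
}

# Two made-up readings of a read counter taken 10 seconds apart.
per_second 1000 6530 10    # prints 553
```

Keeping the previous sample and dividing the difference by the interval is what lets a cumulative counter be displayed as a rate.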

Deployment

Casstat is simple to deploy. Copy the script to the node of interest and ensure the Cassandra nodetool binary is in the PATH. We usually create a small wrapper script, shown below, which redirects the output to a named file in /tmp:

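#!/bin/bash
# Make nodetool visible to casstat
export PATH=/<path to cassandra>/bin:$PATH
# Run casstat in the background, writing to a per-host file in /tmp
/tmp/casstat -e -k Membership -c Customer -x -p -s -d <path to cassandra log>/system.log -o md0 -f eth0 > /tmp/$(hostname).casstat &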

Sections of casstat output

casstat has a number of default sections that are always emitted. See the Options section for how to turn on optional extra stats.

The first section shows reads and writes per second on this node and the average latency of these transactions in milliseconds. Note that the latency is an average since cluster boot time; 99th and 95th percentiles are also available through the -s option. In this example only reads are occurring on this node, with an average latency of 0.1 - 0.2 ms.

Epoch      Rds/s   RdLat   Wrts/s  WrtLat
1320279185 553     0.219    0      0.000
1320279197 203     0.219    0      0.000
1320279208 244     0.177    0      0.000
1320279220 330     0.101    0      0.000
1320279232 567     0.196    0      0.000
1320279243 882     0.215    0      0.000
1320279255 666     0.183    0      0.000
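
Because the output is plain columns, it is easy to post-process. The sketch below averages one column of casstat output; the helper name avg_column is hypothetical, and it assumes the -e option was used so that every data line starts with an integer epoch. Banner lines and log events are skipped because their first field is not an integer.

```shell
# avg_column COLUMN [FILE...]
# Hypothetical helper: averages one numeric column, reading the named
# files or standard input. Only lines whose first field is an integer
# epoch are counted.
avg_column() {
    col="$1"; shift
    awk -v col="$col" '
        $1 ~ /^[0-9]+$/ { sum += $col; n++ }
        END { if (n) printf "%.1f\n", sum / n }
    ' "$@"
}

# Average Rds/s (column 2) over three of the samples shown above.
avg_column 2 <<'EOF'
Epoch      Rds/s   RdLat   Wrts/s  WrtLat
1320279185 553     0.219    0      0.000
1320279197 203     0.219    0      0.000
1320279208 244     0.177    0      0.000
EOF
# prints 333.3
```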

The second section is the breakdown of CPU time: percent user, system, idle, iowait and steal.

%user   %system %idle   %iowait %steal
 0.93     0.36  98.56    0.00    0.15
 1.16     0.34  98.35    0.00    0.15
 0.84     0.39  98.59    0.00    0.19   
 1.69     0.55  97.57    0.00    0.19
 1.52     0.46  97.84    0.01    0.16
 1.19     0.44  98.21    0.00    0.16

The third section shows disk statistics for this node. The disk to report is specified with the -o option; it can be a stripe such as md0 or an individual disk, but only one device is reported. If no disk is specified, "---" is emitted for these fields.

md0r/s  w/s     rMB/s   wMB/s
 0.00   0.40    0.00    0.00
 0.00   0.50    0.00    0.00
 0.00   0.00    0.00    0.00
 0.00   0.10    0.00    0.00
 0.00   0.30    0.00    0.00
 0.00   0.80    0.00    0.00
 0.00   0.30    0.00    0.00

The final default section is network bandwidth in kilobits per second. The network interface is specified with the -f option. If no network interface is specified, "---" is emitted for these fields.

NetRxKb NetTxKb
23866   51476
23058   36835
19896   35025
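
As a rough illustration of how a bandwidth figure like this can be derived from interface byte counters, here is a minimal sketch. It is not the casstat implementation: the helper name kbits_per_sec is hypothetical, the byte counts are made up, and it assumes 1 kilobit = 1000 bits (the source does not say whether casstat uses 1000 or 1024).

```shell
# kbits_per_sec PREV_BYTES CUR_BYTES INTERVAL
# Hypothetical helper: converts the growth of a byte counter over
# INTERVAL seconds into kilobits per second (assuming 1000 bits per kilobit).
kbits_per_sec() {
    awk -v prev="$1" -v cur="$2" -v secs="$3" \
        'BEGIN { printf "%d\n", (cur - prev) * 8 / 1000 / secs }'
}

# 29,832,500 bytes received during a 10 second interval.
kbits_per_sec 0 29832500 10    # prints 23866
```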

Options

Run casstat -h to list the command line options:

casstat -h
usage: /tmp/casstat options
 
OPTIONS:
   -e             Include Epoch per line
   -t             Include Timestamp per line
   -d <file location> Location of log for scraping
   -l <filename>  Dump log events to a seperate file
   -i <interval>  Set interval, default 10
   -k <keyspace>  Keyspace to monitor
   -c <column family>  Column Family to monitor
   -o <disk name> Stripe or disk to monitor
   -f <network name> Network interface to monitor
   -x             Dont dump headers
   -r             Read Repair stats
   -p             Compaction stats
   -s             Response time percentiles - must specify CF
   -j             Keycache and Read stage stats
   -n <count>     Number of samples to take

An explanation of each option is as follows

-k <keyspace>: This parameter is required and specifies the Keyspace you wish to monitor. casstat will exit if -k is not set

-d: Location of log for scraping. This must be specified or the logs will not be used

-o: Disk for which to dump iostat statistics. Can be a stripe such as md0 or an individual disk such as sdb

-f: Network interface to dump stats for

-i <interval> : By default casstat samples at a rate of 10 seconds. Use this option to change the sampling interval. The minimum allowed is 2 seconds

-e: Prepends each line of output with the current epoch value. This is useful in correlating data from multiple nodes together and avoiding date conversion

   Epoch       Rds/s   RdLat   Wrts/s  .....
   1320249291  138 62.997  1620    ....
   1320249301  129 94.332  1804    ....
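
When an epoch needs to be read by a person, it can be converted after the fact. The sketch below assumes GNU date, which accepts the -d @N form; the helper name epoch_to_utc is hypothetical.

```shell
# epoch_to_utc EPOCH
# Hypothetical helper: prints an epoch value as a UTC date and time.
# Assumes GNU date (the -d @N form is not available in every date).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# First sample from the example above.
epoch_to_utc 1320249291    # prints 2011-11-02 15:54:51
```

Converting on demand keeps the raw files in epoch form, so lines from different nodes can still be merged and sorted numerically.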

-t: Prepends a timestamp to each line of output. This is more human readable but harder to post-process

Time        Rds/s RdLat Wrts/s WrtLat %user %system %idle %iowait %steal md0r/s w/s rMB/s wMB/s NetRxKb NetTxKb
23:51:08      0   0.000     0  0.000  0.78     0.03 99.20  0.00    0.00   0.00 0.90 0.00  0.00     129     23

-l <log_name>: casstat scrapes the cassandra logs for interesting events. For instance when a compaction completes you get the output below. The -l option lets you redirect this output to a different file and reduces the clutter in the casstat output

   23:20:44,945 <Compaction Completed>

-c <Column Family> : Specifies the Column family to monitor. If none is specified then stats for the whole Keyspace will be displayed

-x: By default casstat will dump a banner in the output every 10 samples (see example below). This option suppresses this output

Epoch       Rds/s   RdLat   Wrts/s  WrtLat  %user   %system %idle   %iowait %steal  md0r/s  w/s rMB/s   wMB/s   NetRxKb NetTxKb

-n <number> : By default casstat runs forever. This parameter limits the number of samples emitted, after which casstat terminates

-r : This parameter adds a column indicating the number of times a thread entered the Read Repair stage. This value can be useful if the read-repair-chance parameter is non zero for the column family and you need an indication of the number of rows that need repair (for instance after a node restore).

....    NetRxKb NetTxKb RdRep
....      333     19      0
....      45     269      0

-p: This parameter dumps statistics about the currently executing and pending compactions. The first field is always Pen - the number pending. After that is a variable number of fields depending on the number of active compactions. Compaction types can be one of:

Min - Minor
Maj - Major
Val - Validation

In the example below there are two active minor compactions and one finishes after 10 seconds

 .... Pen/4 Min/58%  Min/99%
 .... Pen/2 Min/59%
 .... Pen/2 Min/62%
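
Since the Pen field has a fixed form, the pending count can be pulled out of saved output, for example to watch for a growing backlog. A minimal sketch, with a hypothetical helper name and assuming the Pen/N layout shown above:

```shell
# pending_compactions
# Hypothetical helper: reads casstat lines on standard input and prints
# the number after "Pen/" for each line that has one.
pending_compactions() {
    sed -n 's|.*Pen/\([0-9][0-9]*\).*|\1|p'
}

echo ' .... Pen/4 Min/58%  Min/99%' | pending_compactions    # prints 4
```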

-s: This parameter dumps the 99th and 95th percentile response times for both reads and writes in the last sample period. A column family must be specified with -c. If there were no transactions these numbers will be 0.00

Percentiles     Read            Write                   Compacts

99th  0.642 ms 95th  0.446 ms 99th 0.00 ms 95th 0.00 ms Pen/0
99th  0.642 ms 95th  0.446 ms 99th 0.00 ms 95th 0.00 ms Pen/0
99th  0.535 ms 95th  0.446 ms 99th 0.00 ms 95th 0.00 ms Pen/0
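
For readers unfamiliar with how a percentile is read off a latency histogram, here is a generic sketch. It is not the casstat code and does not parse real nodetool cfhistograms output, whose layout varies between Cassandra versions. It assumes the input has already been reduced to "value count" pairs sorted by ascending value; the helper name percentile and the sample buckets are hypothetical.

```shell
# percentile P
# Hypothetical helper: reads "value count" pairs on standard input, sorted
# by ascending value, and prints the smallest value whose cumulative count
# reaches P percent of the total. Prints 0.00 when there are no samples.
percentile() {
    awk -v p="$1" '
        { value[NR] = $1; count[NR] = $2; total += $2 }
        END {
            if (total == 0) { print "0.00"; exit }
            target = total * p / 100
            for (i = 1; i <= NR; i++) {
                seen += count[i]
                if (seen >= target) { print value[i]; exit }
            }
        }
    '
}

# Made-up buckets: latency value and number of requests in that bucket.
printf '100 50\n200 40\n300 6\n400 4\n' | percentile 95    # prints 300
printf '100 50\n200 40\n300 6\n400 4\n' | percentile 99    # prints 400
```

To report percentiles for the last sample period rather than since startup, the bucket counts would first need to be differenced against the previous sample.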

-j: This option dumps information on the Key Cache hit rate and the number of times threads were in the Read Stage or waiting to do so

Examples

Simplest form, just give Cassandra read and write stats and system utilization. No disk or network stats

/tmp/casstat -k Membership

Add an epoch to each line. Monitor CF Customer in Keyspace Membership. Calculate the 95th and 99th percentile response times and dump the number of compactions. Display a sample every 5 seconds but suppress the banner every 10 samples. The Cassandra log is also specified, as are the disk to monitor and the network interface

/tmp/casstat -i 5 -e -k Membership -c Customer -x -p -s -d /mnt/cass/logs/system.log  -o md0 -f eth0

