casstat
By Denis Sheahan, Netflix Inc.
Casstat is a command line tool that monitors various aspects of Cassandra performance. It runs on the Cassandra node itself and is modeled on classic Unix tools like vmstat and mpstat. It is designed to be always on, running in the background and redirecting its output to a file. It can tag each line of output with an Epoch or timestamp so the data can be correlated with an event.
Note: casstat was designed to run on EC2 instances in the cloud.
Casstat is a bash script that calls other tools at regular intervals, compares the data with its previous sample and normalizes it on a per second basis. The tools it uses are
- nodetool cfstats to get Column Family performance data
- nodetool tpstats to get internal state changes
- nodetool cfhistograms to get 95th and 99th percentile response times
- nodetool compactionstats to get details on number and type of compactions
- iostat to get disk and cpu performance data
- ifconfig to calculate network bandwidth
- The cassandra log for interesting events
casstat assumes these tools are in the PATH
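The normalization step described above (compare with the previous sample, divide by the interval) can be sketched as follows. This is an illustration only, not code taken from casstat; the function name `rate_per_sec` and the counter values are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not taken from casstat): turn two readings of a
# cumulative counter into a per-second rate.
#   $1 = previous reading, $2 = current reading, $3 = seconds between them
rate_per_sec() {
    local prev=$1 cur=$2 interval=$3
    # Integer arithmetic: the result is truncated toward zero.
    echo $(( (cur - prev) / interval ))
}

# Made-up example: a read counter went from 120000 to 125530 in 10 seconds.
rate_per_sec 120000 125530 10   # prints 553
```

Keeping only the previous reading between samples is enough for this calculation, which suits a tool that runs indefinitely in the background.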
Casstat is simple to deploy. Copy the script to the node of interest, and ensure the cassandra nodetool binary is in the path. We usually create a small script - shown below - which redirects the output to a named file in /tmp
#!/bin/bash
export PATH=/<path to cassandra>/bin:$PATH
/tmp/casstat -e -k Membership -c Customer -x -p -s -d <path to cassandra log>/system.log -o md0 -f eth0 > /tmp/`hostname`.casstat &
casstat has a number of default sections that are always emitted. See the options section for how to turn on optional extra stats.
The first section is reads and writes per second on this node and the average latency of these transactions in milliseconds. Note that the latency is an average since cluster boot time; 99th and 95th percentiles are available with the -s option, see below. In this example only reads are occurring on this node, with an average latency of 0.1 - 0.2 ms.
Epoch Rds/s RdLat Wrts/s WrtLat
1320279185 553 0.219 0 0.000
1320279197 203 0.219 0 0.000
1320279208 244 0.177 0 0.000
1320279220 330 0.101 0 0.000
1320279232 567 0.196 0 0.000
1320279243 882 0.215 0 0.000
1320279255 666 0.183 0 0.000
The second section is the breakdown of CPU time spent: percent user, system, idle, iowait and steal.
%user %system %idle %iowait %steal
0.93 0.36 98.56 0.00 0.15
1.16 0.34 98.35 0.00 0.15
0.84 0.39 98.59 0.00 0.19
1.69 0.55 97.57 0.00 0.19
1.52 0.46 97.84 0.01 0.16
1.19 0.44 98.21 0.00 0.16
The third section is the disk statistics on this node. The disk to report is specified with the -o option. This can be a stripe such as md0 or an individual disk, but only one device is reported. If no disk is specified then "---" is emitted for these fields.
md0r/s w/s rMB/s wMB/s
0.00 0.40 0.00 0.00
0.00 0.50 0.00 0.00
0.00 0.00 0.00 0.00
0.00 0.10 0.00 0.00
0.00 0.30 0.00 0.00
0.00 0.80 0.00 0.00
0.00 0.30 0.00 0.00
The final default section is network bandwidth in kilobits per second. The network interface is specified using the -f option. If no network interface is specified then "---" is emitted for these fields.
NetRxKb NetTxKb
23866 51476
23058 36835
19896 35025
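As a rough illustration of how a kilobits-per-second figure can be derived from interface byte counters, here is a minimal sketch. It is not casstat's own code. The function name `kbits_per_sec` is hypothetical, the counter values are made up, and it assumes 1 kilobit = 1000 bits.

```bash
#!/bin/bash
# Hypothetical helper (not taken from casstat): convert the growth of a
# byte counter between two samples into kilobits per second.
#   $1 = previous byte count, $2 = current byte count, $3 = interval in seconds
# Assumption: 1 kilobit = 1000 bits.
kbits_per_sec() {
    local prev=$1 cur=$2 interval=$3
    # bytes -> bits (x8), bits -> kilobits (/1000), then per second (/interval)
    echo $(( (cur - prev) * 8 / 1000 / interval ))
}

# Made-up example: 29832500 bytes received during a 10 second interval.
kbits_per_sec 1000000000 1029832500 10   # prints 23866
```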
Run casstat -h for command line options
casstat -h
usage: /tmp/casstat options
OPTIONS:
-e Include Epoch per line
-t Include Timestamp per line
-d <file location> Location of log for scraping
-l <filename> Dump log events to a seperate file
-i <interval> Set interval, default 10
-k <keyspace> Keyspace to monitor
-c <column family> Column Family to monitor
-o <disk name> Stripe or disk to monitor
-f <network name> Network interface to monitor
-x Dont dump headers
-r Read Repair stats
-p Compaction stats
-s Response time percentiles - must specify CF
-j Keycache and Read stage stats
-n <count> Number of samples to take
An explanation of each option is as follows
-k <keyspace>: This parameter is required and specifies the Keyspace you wish to monitor. casstat will exit if -k is not set
-d: Location of log for scraping. This must be specified or the logs will not be used
-o: Disk to dump the iostat statistics for. Can be a stripe such as md0 or an individual disk such as sdb
-f: Network interface to dump stats for
-i <interval>: By default casstat samples every 10 seconds. Use this option to change the sampling interval. The minimum allowed is 2 seconds
-e: Prepends each line of output with the current epoch value. This is useful in correlating data from multiple nodes together and avoiding date conversion
Epoch Rds/s RdLat Wrts/s .....
1320249291 138 62.997 1620 ....
1320249301 129 94.332 1804 ....
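Because the epoch is the first column, correlating samples with an event can be a simple filter on that column. The sketch below is a hypothetical post-processing helper, not part of casstat. It assumes the output was captured with -e, and the name `casstat_window` is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper (not part of casstat): print the samples of a
# "casstat -e" output file whose epoch (first column) lies between
# $2 and $3 inclusive.
#   $1 = casstat output file, $2 = start epoch, $3 = end epoch
casstat_window() {
    local file=$1 start=$2 end=$3
    # The regex on $1 skips banner lines such as "Epoch Rds/s ...".
    awk -v s="$start" -v e="$end" \
        '$1 ~ /^[0-9]+$/ && $1 >= s && $1 <= e' "$file"
}

# Usage (file name is an example):
#   casstat_window /tmp/myhost.casstat 1320249291 1320249301
```

Running the same filter with the same epoch range on the output files from several nodes gives directly comparable slices, with no date conversion involved.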
-t: Prepends a timestamp to each line of output. This is more human readable but harder to post-process
Time Rds/s RdLat Wrts/s WrtLat %user %system %idle %iowait %steal md0r/s w/s rMB/s wMB/s NetRxKb NetTxKb
23:51:08 0 0.000 0 0.000 0.78 0.03 99.20 0.00 0.00 0.00 0.90 0.00 0.00 129 23
-l <log_name>: casstat scrapes the cassandra logs for interesting events. For instance, when a compaction completes you get the output below. The -l option lets you redirect this output to a different file, which reduces the clutter in the casstat output
23:20:44,945 <Compaction Completed>
-c <Column Family>: Specifies the Column Family to monitor. If none is specified then stats for the whole Keyspace will be displayed
-x: By default casstat will dump a banner in the output every 10 samples (see example below). This option suppresses this output
Epoch Rds/s RdLat Wrts/s WrtLat %user %system %idle %iowait %steal md0r/s w/s rMB/s wMB/s NetRxKb NetTxKb
-n <number>: By default casstat runs forever. This parameter limits the number of samples emitted, after which casstat will terminate
-r: This parameter adds a column indicating the number of times a thread entered the Read Repair stage. This value can be useful if the read-repair-chance parameter is nonzero for the column family and you need an indication of the number of rows that need repair (for instance after a node restore).
.... NetRxKb NetTxKb RdRep
.... 333 19 0
.... 45 269 0
-p: This parameter dumps statistics about the currently executing and pending compactions. The first field is always Pen, the number pending. After that comes a variable number of fields, one per active compaction. Compaction types can be one of
Min - Minor
Maj - Major
Val - Validation
In the example below there are two active minor compactions and one finishes after 10 seconds
.... Pen/4 Min/58% Min/99%
.... Pen/2 Min/59%
.... Pen/2 Min/62%
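Since the number of compaction fields varies from line to line, post-processing is easier if it matches on the field prefix rather than on a fixed column position. The following is a minimal sketch, not part of casstat; the function name `pending_compactions` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not part of casstat): read casstat output lines on
# stdin and print the pending compaction count from each "Pen/<n>" field.
pending_compactions() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^Pen\//) {
                sub(/^Pen\//, "", $i)   # strip the "Pen/" prefix
                print $i
            }
        }
    }'
}

# Example using a line shaped like the output above.
echo '.... Pen/4 Min/58% Min/99%' | pending_compactions   # prints 4
```

Lines without a Pen field, such as banners, produce no output.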
-s: This parameter dumps the 99th and 95th percentile response times for both reads and writes in the last sample period. If there were no transactions these numbers will be 0.00
Percentiles Read Write Compacts
99th 0.642 ms 95th 0.446 ms 99th 0.00 ms 95th 0.00 ms Pen/0
99th 0.642 ms 95th 0.446 ms 99th 0.00 ms 95th 0.00 ms Pen/0
99th 0.535 ms 95th 0.446 ms 99th 0.00 ms 95th 0.00 ms Pen/0
-j: This option dumps information on the Key Cache hit rate and the number of times threads were in the Read Stage or waiting to do so
The simplest form just gives Cassandra read and write stats and system utilization, with no disk or network stats
/tmp/casstat -k Membership
Add an epoch to each line. Monitor CF Customer in Keyspace Membership. Calculate the 95th and 99th percentile response times and dump the number of compactions. Display a sample every 5 seconds but suppress the banner every 10 samples. The Cassandra log is also specified, as are the disk and the network interface to monitor
/tmp/casstat -i 5 -e -k Membership -c Customer -x -p -s -d /mnt/cass/logs/system.log -o md0 -f eth0