Cassandra

Installation

Adapt your System Profile

For instrumentation of the Cassandra client API, you have to inject the Java Agent, and the Cassandra Sensor must be placed in those Agent Groups that make calls to Cassandra. The Sensor needs to be explicitly placed in new Agent Groups as shown in the following figure.

Cassandra Sensor Placement

Cassandra client

For analysis of your Cassandra client code, instrument your application by injecting the Agent (see Java Agent Configuration).

Cassandra server

To trace Cassandra calls from client to server, add a new Agent Group for the Cassandra server node and make sure that the Cassandra Sensor is placed in this Agent Group. Because Cassandra does a lot of threading, you may want to disable the Thread Activation Sensor to reduce the amount of data collected. However, to see statement execution based on the Cassandra CQL3 protocol, the Executor Tagging Sensor must be active.

Then add the AppMon Agent parameter to the JVM_OPTS in the Cassandra server configuration file and restart your nodes.
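
As a rough illustration of the step above, the following sketch assembles the kind of line you might append to the file that defines JVM_OPTS. It assumes the usual -agentpath:<library>=name=<agent name>,server=<collector host>:<port> form of the Agent parameter. The library path, Agent name, Collector host, and port are placeholders, and the class and method names are hypothetical. Check the Java Agent Configuration page for the exact parameter for your installation.

```java
/**
 * Sketch only: builds a JVM_OPTS line that carries the Agent parameter.
 * All concrete values passed in main() are placeholders, not defaults.
 */
class AgentOptsSketch {

    /** Returns a shell assignment that appends the Agent parameter to JVM_OPTS. */
    static String jvmOptsLine(String agentLibPath, String agentName,
                              String collectorHost, int collectorPort) {
        // Assumed parameter form: -agentpath:<lib>=name=<name>,server=<host>:<port>
        String agentParam = "-agentpath:" + agentLibPath
                + "=name=" + agentName
                + ",server=" + collectorHost + ":" + collectorPort;
        return "JVM_OPTS=\"$JVM_OPTS " + agentParam + "\"";
    }

    public static void main(String[] args) {
        // Placeholder values for illustration; replace with your own.
        System.out.println(jvmOptsLine(
                "/opt/dynatrace/agent/lib64/libdtagent.so",
                "Cassandra_Node1",
                "collector.example.com",
                9998));
    }
}
```

The printed line would be added to the Cassandra server configuration file on each node before the restart. Using a distinct Agent name per node is one way to keep the nodes apart in the Agent Group mapping.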

Usage

Transaction Flow

In the Transaction Flow dashlet, Cassandra calls are visualized like database calls. In contrast to traditional databases, Cassandra usually runs in a cluster (meaning multiple server nodes). Consequently, multiple database nodes appear, each labeled with the name of the Keyspace that you execute against and the host/port. For these nodes, the Transaction Flow shows:

  • Execution time contribution
  • Number of calls (in response time mode)
  • Number of calls/min (in topology mode)

When you hover over a Cassandra database node, you can also look at the specific Cassandra calls. You can do the same by drilling down from a particular JVM to the database.

Typical Cassandra Transaction Flow

Database

In the Database dashlet, Cassandra calls based on the Thrift protocol are displayed in a SQL-like form. Cassandra calls based on the native CQL3 protocol are shown directly as CQL statements. Each CassandraNode/Keyspace combination is visualized as a separate Database Pool, so you can check each combination on its own. For suspicious statements, you can review the actual execution and the attached details.

Database Hotspots

The Database Hotspots dashlet shows where your Cassandra calls originate.

PurePaths - Thrift-based

The PurePath shows each Cassandra call by its client method. The Argument column shows details like Keyspace, ColumnFamily, and ConsistencyLevel.

Cassandra PurePaths

Statement details (right-click on a statement to view details) provide further information like the number of manipulated rows (row count).

Cassandra Statement Details

Depending on the executed Cassandra method, the row count value contains (among others):

Operation | Comment | Example Visualization in Database Dashlet
batch_mutate | Number of mutations sent to the server | batch_mutate USING QUORUM
get | Number of returned rows (0 to 1) | get FROM TravelKeyspace.Journey USING QUORUM
get_count | Returned value | get_count FROM TravelKeyspace.Journey USING QUORUM
get_range_slices | Number of returned slices | get_range_slices FROM TravelKeyspace.JourneyTags USING QUORUM
get_slice | Number of returned columns | get_slice FROM TravelKeyspace.Journey USING QUORUM
multiget_count | Size of returned map | multiget_count FROM TravelKeyspace.Journey USING QUORUM
multiget_slice | Size of returned map | multiget_slice FROM TravelKeyspace.JourneyTags USING QUORUM
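
The SQL-like form used in the examples above follows a simple pattern: the operation name, an optional FROM clause with Keyspace and ColumnFamily, and the ConsistencyLevel. The following sketch reproduces that pattern to make it easier to read. It is derived only from the example strings shown here, not from the sensor implementation, and the class and method names are hypothetical.

```java
/**
 * Sketch only: renders a Thrift operation in the SQL-like form shown
 * in the Database dashlet examples. Not the actual sensor code.
 */
class ThriftStatementSketch {

    /**
     * @param operation        Thrift method name, for example "get_slice"
     * @param keyspace         Keyspace name, or null if not applicable
     * @param columnFamily     ColumnFamily name, or null if not applicable
     * @param consistencyLevel ConsistencyLevel, for example "QUORUM"
     */
    static String render(String operation, String keyspace,
                         String columnFamily, String consistencyLevel) {
        StringBuilder sb = new StringBuilder(operation);
        // Operations such as batch_mutate carry no single Keyspace.ColumnFamily target.
        if (keyspace != null && columnFamily != null) {
            sb.append(" FROM ").append(keyspace).append('.').append(columnFamily);
        }
        sb.append(" USING ").append(consistencyLevel);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("get", "TravelKeyspace", "Journey", "QUORUM"));
        System.out.println(render("batch_mutate", null, null, "QUORUM"));
    }
}
```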

PurePaths - CQL3 binary protocol-based

Depending on the injected agents, client- and/or server-side communication is visualized on the PurePath.

If both sides (client and server) are instrumented, the client method node starts the PurePath on which the database communication takes place (client-side Cassandra CQL3 protocol calls do not automatically start PurePaths). The following figure shows a sample PurePath with client and server instrumented.

CQL3 PurePath - Client and Server Instrumented

The PurePath nodes marked with 2 and 3 specify the CQL3 protocol communication endpoints: methods where CQL3 protocol messages are sent from the client (2 - Connection.write) and received on the server (3 - Message$Dispatcher.messageReceived). All CQL3 protocol messages (such as server startup and statement execution) can be traced using this mechanism.

CQL3 statements may be processed within a Session (node marked with 1) or may be sent directly to Cassandra using Connection.write (nodes marked with 2). CQL statement data is available on both the client side (on the Connection.write node - 2) and the server side (on the respective Statement.execute node - 4). However, the data available on the client differs slightly from that on the server. The following table shows what data is available on the client and the server, and which side's data the Database dashlet uses for further processing:

Data | Client (1) | Server (1) | Database dashlet uses data from
Connection Pool | | |
Database Host | X | X | Server
Database Name | may not be set | X (2) | Server
Database Type | Cassandra by default | Cassandra by default | Server
Database Pool Name | default | default | Client
Database Pool Size | X | - | Client
Database User (default) | - | X | Server
Database Details | | |
Bind values | X | - | Client
Row count | - (3) | only available for SELECT queries | Server
Cassandra specific | | |
Consistency Level | X | X | Server
Query string | X | X |

(1) Client: Datastax 1.0.2 library Server: Cassandra 1.2

(2) The keyspace is not available for BATCH statements as the statements within a batch may be executed on different keyspaces.

(3) As CQL3 protocol communication is asynchronous by default, record count information is only available if the server side is instrumented.

If only the client side or only the server side is instrumented, the data available on that side is used for further processing. This mainly affects the reported execution time of CQL statements, particularly with asynchronous clients, because asynchronous callbacks are currently not supported.
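
To illustrate why execution times can look different with asynchronous clients, the following sketch compares the time spent in an asynchronous submit call with the time until the result is actually available. It is a general illustration, not a description of how the sensors measure. The method fakeExecuteAsync is a hypothetical stand-in built on CompletableFuture and is not a driver API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/**
 * Sketch only: shows that timing an asynchronous submit call captures
 * far less than the time until the statement result is available.
 */
class AsyncTimingSketch {

    /** Hypothetical stand-in for an asynchronous statement execution. */
    static CompletableFuture<String> fakeExecuteAsync(long serverMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(serverMillis); // simulated server-side work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "rows";
        });
    }

    /** Returns { millis spent in the submit call, millis until completion }. */
    static long[] measure(long serverMillis) {
        long start = System.nanoTime();
        CompletableFuture<String> future = fakeExecuteAsync(serverMillis);
        long submitted = System.nanoTime(); // the call returns before the work is done
        future.join();                      // wait for the simulated result
        long completed = System.nanoTime();
        return new long[] {
            TimeUnit.NANOSECONDS.toMillis(submitted - start),
            TimeUnit.NANOSECONDS.toMillis(completed - start)
        };
    }

    public static void main(String[] args) {
        long[] t = measure(200);
        System.out.println("submit call: " + t[0] + " ms, until completion: " + t[1] + " ms");
    }
}
```

A measurement that covers only the submit call misses most of the statement's duration, which is one reason to instrument the server side when asynchronous clients are used.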

The following figures show the data available on client and server nodes respectively.

Tracing into the Cassandra server

In addition to understanding Cassandra calls, AppMon can also trace each call into the Cassandra server, making it possible to understand why certain statements were slow. This includes problems such as server-side garbage collection, CPU, disk, and memory problems.

The Transaction Flow shows which applications communicate with your Cassandra cluster.

Client and Server Instrumented

The Cassandra server nodes display as JVMs with communication coming from your application. You get host and JVM health and can look at Method Hotspot Levels of a particular node right from the Transaction Flow. When you drill down to the PurePath, notice that each call to Cassandra is followed.

Drilldown to PurePath

If you look closely at the Elapsed Time column on the right of the screenshot, you'll notice a surprising difference of about -2 ms between the Synchronous Invocation and Synchronous Path (Thrift) methods. Such slight inaccuracies can occur when timings come from different Agents.

You can now understand the behavior of each call, as well as the impact of GC suspension or CPU problems. You also can see the latency between your application and Cassandra which may heavily influence performance.

Monitoring Cassandra server

In addition to the details shown here, there is an Apache Cassandra Fastpack. It features a new Measure Group specifically for Cassandra server nodes and out-of-the-box Dashboards for monitoring of Cassandra.

Supported client and server versions

Currently, the sensors shipped with AppMon support the following client and Cassandra versions:

 | Supported Versions (Thrift) | Supported Versions (CQL3 binary)
Client | Hector 1.0-1, 1.0-3, 1.0-5, 1.1-2; Astyanax 1.56.44, 1.56.48 | Datastax 1.0.2, 2.1
Cassandra | 1.0, 1.1, 1.2 | 1.2, 2.0

Indirectly Supported Clients

  • Firebrand, an ORM Client that uses Hector underneath.

Although the sensors are written specifically for Hector, Cassandra calls are also shown, and tracing also works, for all other clients that support the Apache Thrift protocol (0.6.x, 0.7.x). However, information about the Keyspace or the Cassandra host that a call is executed against is available only for Hector clients.

Partly Supported Clients