Adapt your System Profile
For instrumentation of the Cassandra client API, you have to inject the AppMon Agent, and the Cassandra Sensor must be placed in those Agent Groups that make calls to Cassandra. The Sensor needs to be explicitly placed in new Agent Groups as shown in the following figure.
Cassandra Sensor Placement
For analysis of your Cassandra client code, instrument your application by injecting the AppMon Agent (see Java Agent Configuration).
To trace Cassandra calls from client to server, add a new Agent Group for the Cassandra server node. Make sure that this Agent Group has the Cassandra Sensor placed. As Cassandra does a lot of threading, you may want to disable the Thread Activation Sensor to reduce the amount of data collected. However, in order to be able to see Cassandra CQL3 protocol based statement execution, the Executor Tagging sensor needs to be active.
Then add the AppMon Agent parameter to the JVM_OPTS in the Cassandra server configuration file and restart your nodes.
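As a minimal sketch of this step, the following lines could be appended to the Cassandra startup configuration. It assumes that the configuration file is conf/cassandra-env.sh, that the Agent library is installed under /opt/dynatrace, and that a Collector is reachable at appmon-collector.example.com on port 9998. The path, the Agent name, and the Collector address are placeholders, not values taken from this page; replace them with those of your installation.

```shell
# Sketch for conf/cassandra-env.sh (assumed file name); all three values are placeholders.
DT_AGENT_LIB="/opt/dynatrace/agent/lib64/libdtagent.so"   # assumed Agent install path
DT_AGENT_NAME="CassandraNode"                             # hypothetical name; must map to the Cassandra Agent Group
DT_COLLECTOR="appmon-collector.example.com:9998"          # hypothetical Collector host and assumed port

# Append the Agent parameter to the JVM options Cassandra starts with.
JVM_OPTS="${JVM_OPTS:-} -agentpath:${DT_AGENT_LIB}=name=${DT_AGENT_NAME},server=${DT_COLLECTOR}"
```

The Agent name is what the System Profile uses to map the JVM to an Agent Group, so it should match the mapping of the Agent Group that has the Cassandra Sensor placed. The change takes effect only after the node is restarted.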
In the Transaction Flow Dashlet, Cassandra calls are visualized like database calls. In contrast to traditional databases, Cassandra usually runs in a cluster of multiple server nodes. Consequently, multiple database nodes appear, each labeled with the name of the Keyspace that you execute against and the host/port. Each node shows:
- Execution time contribution
- Number of calls (in response time mode)
- Number of calls/min (in topology mode)
When you hover over a Cassandra database node, you can also look at the specific Cassandra calls. You can do the same by drilling down from a particular JVM to the database.
Typical Cassandra Transaction Flow
In the Database Dashlet, Cassandra calls based on the Thrift protocol are displayed in a SQL-like form. Calls based on the native CQL3 protocol are shown as the CQL statements themselves. Each CassandraNode/Keyspace combination is visualized as a separate Database Pool, so each combination can be checked on its own. For suspicious statements, the actual execution and the attached details can be reviewed.
The Database Hotspots Dashlet shows where your Cassandra calls originate.
PurePaths - Thrift-based
The PurePath shows each Cassandra call by its client method. The Argument column shows details like Keyspace, ColumnFamily, and ConsistencyLevel.
Statement details (right-click on a statement to view details) provide further information like the number of manipulated rows (row count).
Cassandra Statement Details
Depending on the executed Cassandra method, the row count contains (among others):
|Operation||Comment||Example Visualization in Database Dashlet|
|batch_mutate||Number of mutations sent to the server||batch_mutate USING QUORUM|
|get||number of returned rows (0 to 1)||get FROM TravelKeyspace.Journey USING QUORUM|
|get_count||returned value||get_count FROM TravelKeyspace.Journey USING QUORUM|
|get_range_slices||number of returned slices||get_range_slices FROM TravelKeyspace.JourneyTags USING QUORUM|
|get_slice||number of returned columns||get_slice FROM TravelKeyspace.Journey USING QUORUM|
|multiget_count||size of returned map||multiget_count FROM TravelKeyspace.Journey USING QUORUM|
|multiget_slice||size of returned map||multiget_slice FROM TravelKeyspace.JourneyTags USING QUORUM|
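To make the SQL-like form in the table above concrete, the following sketch rebuilds the example strings from an operation name, a consistency level, and an optional Keyspace and ColumnFamily. The function name render_thrift_call and its argument order are hypothetical. This is not how AppMon produces the text; it only illustrates the pattern visible in the examples.

```shell
# Hypothetical helper that mirrors the display pattern in the table above.
# Usage: render_thrift_call OPERATION CONSISTENCY [KEYSPACE COLUMN_FAMILY]
render_thrift_call() {
  if [ "$#" -ge 4 ]; then
    # Operations shown with a target appear as Keyspace.ColumnFamily after FROM.
    printf '%s FROM %s.%s USING %s\n' "$1" "$3" "$4" "$2"
  else
    # The table shows batch_mutate without a FROM clause.
    printf '%s USING %s\n' "$1" "$2"
  fi
}

render_thrift_call get QUORUM TravelKeyspace Journey
render_thrift_call batch_mutate QUORUM
```

The two calls print the same strings as the get and batch_mutate rows of the table.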
PurePaths - CQL3 binary protocol-based
Depending on the injected agents, client- and/or server-side communication is visualized on the PurePath.
If both sides (client and server) are instrumented, the client method node starts the PurePath on which the database communication takes place (client-side Cassandra CQL3 protocol calls do not automatically start PurePaths). The following figure shows a sample PurePath with client and server instrumented.
CQL3 PurePath - Client and Server Instrumented
The PurePath nodes marked with 2 and 3 specify the CQL3 protocol communication endpoints: methods where CQL3 protocol messages are sent from the client (2 - Connection.write) and received on the server (3 - Message$Dispatcher.messageReceived). All CQL3 protocol messages (such as server startup and statement execution) can be traced using this mechanism.
CQL3 statements may be processed within a Session (node marked with 1) or may be sent directly to Cassandra using Connection.write (nodes marked with 2). CQL statement data is available on both the client (on node Connection.write - 2) and server side (on the respective Statement.execute node - 4). However, the available data on the client slightly differs from that on the server side. The following table shows what data is available on the client and the server and what data is used for further processing in the database dashlet:
|Data||Client (1)||Server (1)||Database dashlet uses data from|
|Database Name:||may not be set||X (2)||Server|
|Database Type:||Cassandra by default||Cassandra by default||Server|
|Database Pool Name:||default||default||Client|
|Database Pool Size:||X|| ||Client|
|Database User (default):|| ||X||Server|
|Record Count: (3)|| ||only available for SELECT queries||Server|
(1) Client: Datastax 1.0.2 library; Server: Cassandra 1.2
(2) The keyspace is not available for BATCH statements as the statements within a batch may be executed on different keyspaces.
(3) As CQL3 protocol communication is asynchronous by default, record count information is only available if the server side is instrumented.
If only the client or only the server side is instrumented, the data available on that side is used for further processing. This particularly affects the reported execution time of CQL statements, above all with asynchronous clients, because asynchronous callbacks are currently not supported.
The following figures show the data available on client and server nodes respectively.
Tracing into the Cassandra server
In addition to understanding Cassandra calls, AppMon can also trace each call into the Cassandra server, making it possible to understand why certain statements were slow. This includes problems such as server-side garbage collection, CPU, disk, and memory problems.
The Transaction Flow shows which applications communicate with your Cassandra cluster.
Client and Server Instrumented
The Cassandra server nodes display as JVMs with communication coming from your application. You get host and JVM health and can look at Method Hotspot Levels of a particular node right from the Transaction Flow. When you drill down to the PurePath, notice that each call to Cassandra is followed.
Drilldown to PurePath
If you look closely at the Elapsed Time column on the right of the screenshot, you'll notice a surprising ~-2ms difference between Synchronous Invocation and Synchronous Path (Thrift) methods. These slight inaccuracies can happen when timings come from different Agents.
You can now understand the behavior of each call, as well as the impact of GC suspension or CPU problems. You can also see the latency between your application and Cassandra, which may heavily influence performance.
Monitoring Cassandra server
In addition to the details shown here, there is an Apache Cassandra Fastpack. It features a new Measure Group specifically for Cassandra server nodes and out-of-the-box Dashboards for monitoring of Cassandra.
Supported client and server versions
Currently, the sensors shipped with AppMon support the following client and Cassandra versions:
| ||Supported Versions Thrift||Supported Versions CQL3 binary|
|Client||Hector 1.0-1, 1.0-3, 1.0-5, 1.1-2; Astyanax 1.56.44, 1.56.48||Datastax 1.0.2, 2.1|
|Cassandra||1.0, 1.1, 1.2||1.2, 2.0|
Indirectly Supported Clients
- Firebrand, an ORM Client that uses Hector underneath.
Although the sensors are written specifically for Hector, all other clients that support the Apache Thrift protocol (0.6.x, 0.7.x) also show Cassandra calls, and tracing works. However, information about the Keyspace or the Cassandra host that a call is executed against is available only for Hector clients.