Performance Analysis: Identify GC bottlenecks in distributed heterogeneous environments

Garbage collection (GC) can have a major impact on application performance. The more distributed an application becomes, the trickier it is to identify the impact of GC on overall transaction response times. In heterogeneous systems it is even harder, because the available tools usually don’t cross runtime and technology boundaries.

Identifying bottlenecks across runtime and technology boundaries

Here is my approach to identifying GC bottlenecks in distributed heterogeneous environments. My application consists of the following 4 tiers:

  1. Java Servlets hosted in JBoss acting as Frontend Server
  2. Java WebServices hosted in JBoss providing business logic to Frontend Server
  3. ASP.NET Web Application hosted in IIS acting as Frontend Server and offering public Web Services to the “outside world”
  4. ASP.NET WebServices hosted in IIS providing business logic to ASP.NET Web Application and to the public web services

I ran some load against the application executing typical transactions that caused the application to cross all 4 tiers for most of the web requests that I simulated with the load testing tool.

In order to identify GC-related bottlenecks in any of my 4 components, I additionally monitored the following counters:

  • Total execution time of my individual transactions
  • Total execution time of my individual transactions excluding Runtime Suspensions (Time spent by GC)
  • Total CPU time of my individual transactions
  • Time spent in the GC on the individual Runtimes (Java & .NET)
  • Execution time of all individual logical layers of my application
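For the Java tiers, the "time spent in the GC" counter can be approximated with the standard `java.lang.management` API. The following is a minimal sketch, not the monitoring tool used for the dashboard in this post: the class name `GcTimeProbe` is hypothetical, and the value it reads is cumulative for the whole JVM rather than attributed to a single transaction. On the .NET tiers a comparable number is exposed by the "% Time in GC" performance counter.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: sums the accumulated GC time reported by the JVM. */
class GcTimeProbe {

    /** Approximate total collection time in milliseconds across all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcTimeMillis();
        // Allocate short-lived garbage so the collector has some work to do.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        long after = totalGcTimeMillis();
        System.out.println("GC time during loop (ms): " + (after - before));
    }
}
```

Sampling this value at a fixed interval and plotting the deltas gives a per-runtime GC time series similar in spirit to the one described here.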

Here is a sample dashboard that gives me a quick overview of how my system performs and where time is spent:

Performance Dashboard of distributed heterogeneous application

Explanation of the 3 graphs

Time spent on CPU/Total/Total without Suspension (Java & .NET)

This graph shows me that little time is actually spent on the CPU (GREEN) when my code is executed. The PurePath Duration w/o Suspension (RED) indicates the total time taken by the code, excluding the time used by the GC. The gap between RED and GREEN is therefore the time my code had to wait on external systems (e.g. database, network, I/O) or spent in synchronization. The PurePath Duration (YELLOW) indicates the total execution time including the time spent by the GC. In my case it shows that quite some time is spent in the GC (the gap between RED and YELLOW).
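The two gaps described above are plain subtractions, which the following sketch spells out. The class `TransactionTiming` and the sample values are hypothetical and chosen only for illustration; they are not measurements from the dashboard.

```java
/** Hypothetical container for the three timings of one transaction, in milliseconds. */
class TransactionTiming {
    final long cpuMs;                  // GREEN: time actually spent on the CPU
    final long durationNoSuspensionMs; // RED: duration excluding GC suspension
    final long durationMs;             // YELLOW: total duration including GC

    TransactionTiming(long cpuMs, long durationNoSuspensionMs, long durationMs) {
        this.cpuMs = cpuMs;
        this.durationNoSuspensionMs = durationNoSuspensionMs;
        this.durationMs = durationMs;
    }

    /** Gap between RED and GREEN: waiting on external systems or synchronization. */
    long waitMs() {
        return durationNoSuspensionMs - cpuMs;
    }

    /** Gap between YELLOW and RED: time the transaction was suspended by the GC. */
    long gcMs() {
        return durationMs - durationNoSuspensionMs;
    }

    /** Fraction of the total duration that was spent in GC suspension. */
    double gcShare() {
        return durationMs == 0 ? 0.0 : (double) gcMs() / durationMs;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        TransactionTiming t = new TransactionTiming(40, 250, 400);
        System.out.println("wait ms:  " + t.waitMs());  // 210
        System.out.println("GC ms:    " + t.gcMs());    // 150
        System.out.println("GC share: " + t.gcShare()); // 0.375
    }
}
```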

Time spent by Garbage Collector (Java & .NET)

This graph shows the time spent in the GC by each of the 4 Java & .NET Runtimes. This gives me an easy overview of the tiers in which the GC has to do most of its work. The two green values are taken from the Java Runtimes (DARK GREEN = Java Frontend, LIGHT GREEN = Java Web Services). The two red values are taken from the .NET Runtimes (DARK RED = .NET Frontend, LIGHT RED = .NET Web Services).

Application Layer Breakdown (Java & .NET)

The way to dive deeper into the actual problematic components of the application is by looking at the execution times of the individual application layers. Each color represents a layer of my application. LIGHT BLUE, for instance, represents time spent in the JDBC layer, and LIGHT GREEN indicates the persistence layer on the Java Web Service Runtime.
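A layer breakdown of this kind boils down to summing execution times per layer and comparing the shares. Here is a minimal sketch of that aggregation; the class `LayerBreakdown`, the layer names passed to it, and the numbers are assumptions made for illustration, and how the per-layer timings are captured in the first place is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical aggregator: accumulates execution time per application layer. */
class LayerBreakdown {
    private final Map<String, Long> millisByLayer = new LinkedHashMap<>();

    /** Adds a measured execution time to the running total of the given layer. */
    void record(String layer, long millis) {
        millisByLayer.merge(layer, millis, Long::sum);
    }

    /** Sum of all recorded execution times. */
    long totalMillis() {
        long total = 0;
        for (long v : millisByLayer.values()) {
            total += v;
        }
        return total;
    }

    /** Fraction of the total time spent in the given layer (0.0 if nothing recorded). */
    double share(String layer) {
        long total = totalMillis();
        return total == 0 ? 0.0 : (double) millisByLayer.getOrDefault(layer, 0L) / total;
    }

    public static void main(String[] args) {
        LayerBreakdown breakdown = new LayerBreakdown();
        // Illustrative values only.
        breakdown.record("JDBC", 120);
        breakdown.record("Persistence", 60);
        breakdown.record("JDBC", 20);
        breakdown.record("Servlet", 50);
        System.out.println("total ms:   " + breakdown.totalMillis()); // 250
        System.out.println("JDBC share: " + breakdown.share("JDBC")); // 0.56
    }
}
```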

Drilling into a Transaction

Having identified an area that I want to have a closer look at, I can now drill down into those transactions that had a high GC time. Drilling into the individual PurePath (which represents a single transaction), or just looking at the methods that were executed in the particular timeframe, shows where the GC influenced the overall performance:

Runtime Suspension Time (GC Time) in a single transaction
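Picking the transactions worth drilling into can be thought of as a filter and a sort on suspension time. The sketch below assumes each transaction is available as a simple record with its total duration and its duration without suspension; the class `GcHotspots`, the nested `Tx` type, the threshold, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: ranks transactions by the time they were suspended by the GC. */
class GcHotspots {

    /** Hypothetical record of one transaction; all times in milliseconds. */
    static final class Tx {
        final String name;
        final long durationMs;             // total duration including GC
        final long durationNoSuspensionMs; // duration excluding GC suspension

        Tx(String name, long durationMs, long durationNoSuspensionMs) {
            this.name = name;
            this.durationMs = durationMs;
            this.durationNoSuspensionMs = durationNoSuspensionMs;
        }

        long gcMs() {
            return durationMs - durationNoSuspensionMs;
        }
    }

    /** Returns transactions with at least minGcMs of GC time, highest GC time first. */
    static List<Tx> worstByGc(List<Tx> txs, long minGcMs) {
        List<Tx> result = new ArrayList<>();
        for (Tx tx : txs) {
            if (tx.gcMs() >= minGcMs) {
                result.add(tx);
            }
        }
        result.sort((a, b) -> Long.compare(b.gcMs(), a.gcMs()));
        return result;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        List<Tx> txs = new ArrayList<>();
        txs.add(new Tx("login", 120, 110));
        txs.add(new Tx("search", 900, 400));
        txs.add(new Tx("checkout", 650, 450));
        for (Tx tx : worstByGc(txs, 100)) {
            System.out.println(tx.name + ": " + tx.gcMs() + " ms in GC");
        }
    }
}
```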


There are some easy ways to identify which components of your application contribute to the overall performance and whether GC plays a big role in it. Use the different performance counters that are available and draw your conclusions.

Andreas Grabner has 20+ years of experience as a software developer, tester and architect and is an advocate for high-performing cloud scale applications. He is a regular contributor to the DevOps community, a frequent speaker at technology conferences and regularly publishes articles. You can follow him on Twitter: @grabnerandi