Does your application architecture use message queues to feed work items such as “Update Product Inventory” or “Send out Notification Emails” to backend batch processing? Message queues work well in these use cases because they decouple your system components and let your backend process requests asynchronously. Under heavy load the queue simply grows, and the background job catches up when there is less load on the system, which spreads the workload more evenly throughout the day.

This is a big advantage over synchronous processing, but what happens if your background jobs can’t keep up with all incoming requests? This is a more complex problem than it might seem at first! You could add more background processes, reduce the number of items flowing into the queue, or optimize your background job implementation, and there are many other possibilities. Where and how should you start optimizing?

This post follows an interesting problem we worked through with one of our customers. Their eCommerce web application was adding about 40 inventory item update messages per minute to a processing queue. Unfortunately, the background job could only process six messages per minute, which caused a huge backlog and led to badly outdated inventory information in the web application and internal systems.
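To make the size of that gap concrete, here is a minimal back-of-the-envelope sketch in Java. It assumes constant arrival and processing rates (real traffic will fluctuate), and the class and method names are purely illustrative, not part of the customer's system.

```java
/** Back-of-the-envelope backlog model, assuming constant arrival and processing rates. */
final class QueueBacklog {

    /** Net backlog change per minute; a positive value means the queue keeps growing. */
    static int netGrowthPerMinute(int arrivalsPerMinute, int processedPerMinute) {
        return arrivalsPerMinute - processedPerMinute;
    }

    /** Backlog after the given number of minutes; a queue can never hold fewer than zero messages. */
    static long backlogAfter(long initialBacklog, int arrivalsPerMinute,
                             int processedPerMinute, long minutes) {
        long net = netGrowthPerMinute(arrivalsPerMinute, processedPerMinute);
        return Math.max(0L, initialBacklog + net * minutes);
    }

    public static void main(String[] args) {
        // Rates from this post: about 40 messages/minute in, 6 messages/minute out.
        System.out.println("Growth per minute: " + netGrowthPerMinute(40, 6));
        System.out.println("Backlog after one hour: " + backlogAfter(0, 40, 6, 60));
    }
}
```

With the rates above, the backlog grows by 34 messages every minute, or 2,040 messages per hour, which is why the inventory data fell so far behind.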

Instead of adding more background processing jobs, they analyzed the performance of the background process itself and identified the WebSphere XML parser as the performance bottleneck. After replacing it with Apache Xalan, throughput increased tenfold, and they could now process at least 60 messages per minute! The following chart shows the throughput of the two queues in their system and the jump in de-queuing actions when they switched to the alternative XML parser, first at about 10:20AM for the worker process on the first queue and then at about 10:45AM for the background process working on the second queue:

After switching to Apache Xalan the batch processing could process 10 times more messages from the queue

How They Analyzed the Bottleneck in Minutes

To understand why the background job was handling messages so slowly, they began by analyzing the execution of each individual message that was pulled out of the queue. It turned out that each individual job took between 8 and 212 seconds to execute. On average, a single message took about 10 seconds to process, resulting in the slow average throughput of 6 messages per minute. The following screenshot shows a selection of these background jobs:

Looking at each individual job execution makes it clear that 99% of the time was spent in I/O and took an average of 60 seconds

Looking at the methods that contribute to these PurePaths made it obvious that all this time was spent in the IBM XML Parser while running through the XSLTCompiler:

99% of the time is spent in the compile method invoked by the customer’s transformXml implementation – that’s the hotspot to focus on
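For readers who want to check where time goes in their own XSLT code, the following is a minimal sketch that times stylesheet compilation and the actual transformation separately, using only the standard JAXP API. The stylesheet, the sample XML, and all class and method names are hypothetical; the customer's transformXml implementation is not shown in this post and may look quite different.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Times the two phases of an XSLT transformation: compiling the stylesheet and applying it. */
final class XsltPhaseTiming {

    // Hypothetical stylesheet: turns the sku attribute of each <item> into a <sku> element.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/inventory'>"
      + "<skus><xsl:for-each select='item'>"
      + "<sku><xsl:value-of select='@sku'/></sku>"
      + "</xsl:for-each></skus>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Compile phase: parses the stylesheet and builds a reusable Templates object. */
    static Templates compile(TransformerFactory factory, String xslt) throws TransformerException {
        return factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    /** Transform phase: applies an already compiled stylesheet to one XML document. */
    static String transform(Templates templates, String xml) throws TransformerException {
        StringWriter out = new StringWriter();
        templates.newTransformer()
                 .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();

        long start = System.nanoTime();
        Templates templates = compile(factory, XSLT);
        long compiled = System.nanoTime();
        String result = transform(templates,
            "<inventory><item sku='A1'/><item sku='B2'/></inventory>");
        long transformed = System.nanoTime();

        System.out.printf("compile: %d us, transform: %d us%n",
            (compiled - start) / 1_000, (transformed - compiled) / 1_000);
        System.out.println(result);
    }
}
```

The absolute numbers depend on the JVM, the XSLT implementation in use, and the stylesheet, so treat the output as a way to compare the two phases rather than as a benchmark.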

Looking at the performance contribution of that XML Transformation over a longer period of time also validates that this is the main reason for the slow batch processing. It impacts every single message in the queue:

Proof that the extremely poor performance of TransformXml is not an occasional problem, but instead occurs for every message processed

After further investigation they discovered that the bad performance was in part caused by over 37000 Java Exceptions thrown within the execution trace of the IBM XML Parser:

More than 37000 exceptions are thrown while the IBM XML Parser processes the XML Content causing most of the performance overhead

Solution: Moving to a Different XML Implementation

After seeing the performance of the standard XML Parser shipped with IBM WebSphere, they moved to Apache Xalan. This improved message processing throughput from 6 to 60 messages per minute, a tenfold increase! As mentioned in the introduction, they have two queues in total. They applied the software change to the first background process, working on the first queue, at about 10:20AM and saw a jump from about 6 to about 60 messages per minute. About 20 minutes later they applied the same change to the second queue, which is reflected by the second jump in throughput shown in the graph.

Significantly higher message throughput after switching XML Parsers, showing a 10X performance improvement
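This post does not describe how the customer wired in the new implementation. One common way to select an XSLT implementation in Java is through the standard JAXP TransformerFactory lookup, either by setting the javax.xml.transform.TransformerFactory system property or by requesting a factory class by name. The sketch below takes the second route and assumes the Xalan-Java jar is on the classpath; if it is not, it falls back to the platform default. The class and method names are illustrative only.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

/** Picks a specific XSLT implementation via JAXP, falling back to the platform default. */
final class TransformerFactorySelector {

    // Factory class of Apache Xalan-Java's interpretive processor
    // (only loadable when the Xalan jar is on the classpath).
    static final String XALAN_FACTORY = "org.apache.xalan.processor.TransformerFactoryImpl";

    /** Returns the named factory if it can be loaded, otherwise the default factory. */
    static TransformerFactory preferred(String factoryClassName) {
        try {
            return TransformerFactory.newInstance(
                factoryClassName, TransformerFactorySelector.class.getClassLoader());
        } catch (TransformerFactoryConfigurationError notAvailable) {
            // The requested implementation is missing; use whatever the platform provides.
            return TransformerFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = preferred(XALAN_FACTORY);
        // Printing the class name shows which implementation is actually in use.
        System.out.println("Using: " + factory.getClass().getName());
    }
}
```

Requesting the factory explicitly keeps the choice local to the batch job, whereas the system property changes the default for the whole JVM. Which is appropriate depends on what else runs in the same application server.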

Want to share your stories?

If you find stories like this useful and want to share your own with the readers of this blog, let us know. Either post a comment on this blog or send me an email: andreas.grabner@compuware.com. I’ve also written more posts like this one in the past, so if you found this one useful and interesting, check out the following posts as well: Optimize Load Balancers, Analyzing Stuck Transactions, or How to Triple Application Throughput.