Brian Perrault – System Engineer

Thanks to our guest blogger Derek Abing and his co-author Brian Perrault – both System Engineers with a leading insurance company focusing on application performance. More details about their work at the end of this blog.

In production support it is often hard to correlate what is happening on local servers with what users report experiencing. In April, the developers of a Java application that handles electronic distribution of scanned mail and electronic faxes were receiving reports from remote offices that the application was running slowly. They came to our Performance Availability and Capacity Management (PaCMan) team for help in determining the cause. From plain server-side dynaTrace, everything looked fine: no particular transaction was taking a substantial amount of time. This led us to believe that the problem lay on the client side. Coincidentally, we were in the middle of a Proof of Concept for Dynatrace User Experience Management (UEM), so we decided to apply that effort to this application to help identify the issue.

Getting insight into the actual click paths of end users and the load behavior of pages on certain browsers allowed us to improve Client Rendering Time by 47% and Overall Page Load Time by 29%. It also prompted the implementation of a Struts feature that prevents users from impatiently clicking Save too many times, which was causing problems on our server-side implementation.

Finding #1: JavaScript Load Behavior causing problems on IE7

After configuring UEM and collecting data for a couple of days, the analysis began. The first thing to become readily apparent with UEM was that a large amount of time was being spent on the client side for rendering. We saw this by putting the server-side contribution, network contribution, and estimated client time on the same graph. The application developers set about correcting this immediately. The client browser was upgraded from IE 7.0 to IE 9.0, and several common Web Performance Optimization changes, such as changing the load behavior of JavaScript files, were made to reduce the render time. Figure 1 below shows the time spent on the client side in a typical work week before (dashed line) and after (solid line) these changes were implemented. The result was an average reduction of 608ms (47.57%) in client-side rendering time.

Figure 1: Changing JavaScript load behavior helped to improve Client-side Rendering Time by 47%

Finding #2: “Impatient” users repeatedly trigger the save action

UEM also gives you the ability to directly correlate user actions at the client with what happens on the server. Using this ability, we identified that some users were exacerbating their slowdowns by repeatedly clicking buttons and hitting refresh while they were experiencing an issue. The developers plan to fix this by implementing an Apache Struts feature that allows only the first press of a button to trigger an action. Figure 2 below shows a user who was experiencing slowdowns due to network latency (viewable in UEM) but was making them worse by repeatedly clicking the “Save to XXXX” button and the refresh button before the page had finished loading.
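
The post does not show the Struts configuration, so the following is only a minimal sketch of the general idea behind such a feature: the server hands out a one-time token with each form and accepts a given token once, so second and third clicks on Save are ignored. It is plain Java rather than Struts code, and `OneTimeTokenGuard`, `issueToken`, and `tryConsume` are hypothetical names chosen for illustration.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Struts: issues one-time tokens and
// accepts each token exactly once, so repeated clicks are ignored.
class OneTimeTokenGuard {
    // Tokens that have been issued but not yet consumed.
    private final Set<String> pending = ConcurrentHashMap.newKeySet();

    // Call when rendering the form; the token is sent along with the form.
    String issueToken() {
        String token = UUID.randomUUID().toString();
        pending.add(token);
        return token;
    }

    // Call when the form is submitted. Returns true only for the first
    // submission carrying this token. Set.remove is atomic on this set,
    // so two concurrent duplicate clicks cannot both succeed.
    boolean tryConsume(String token) {
        return token != null && pending.remove(token);
    }
}
```

In a web application the token would typically travel in a hidden form field, and a rejected duplicate could simply re-display the result of the first submission instead of running the save again.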

Figure 2: Getting insight into end users’ actions revealed problems with users impatiently clicking Save several times

Finding #3: jQuery download impacting page load time

UEM also gave the application developers insight into the individual components and web calls made for each page. This let them see that, even though individual actions on the server were running fine, several server functions could contribute to a single web page. The same view showed how many resources were being downloaded for each page and that, in some rare cases, those resources were taking a long time to load. The developers are in the midst of implementing resource caching to mitigate this issue. Figure 3 shows a waterfall diagram of a user action (click on “Access Delegated Work Lists”) in which a slow-loading JavaScript file made the overall page load time much higher.
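
Resource caching is usually driven by the `Cache-Control` response header. The sketch below is an assumption about how such a policy might look, not the developers' implementation: the hypothetical `StaticCachePolicy` class picks a header value from the file extension, and both the one-week lifetime and the list of extensions are illustrative values only.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: chooses a Cache-Control header value for a request
// path (without query string) based on its file extension.
class StaticCachePolicy {
    // Extensions treated as static, rarely changing resources (assumption).
    private static final Set<String> STATIC_EXTENSIONS =
            Set.of("js", "css", "png", "gif", "jpg");

    // One week in seconds; an illustrative value, not one from the post.
    private static final long MAX_AGE_SECONDS = 7L * 24 * 60 * 60;

    static String cacheControlFor(String path) {
        int dot = path.lastIndexOf('.');
        String ext = dot < 0
                ? ""
                : path.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (STATIC_EXTENSIONS.contains(ext)) {
            // The browser may reuse its copy without contacting the server
            // until the lifetime expires.
            return "public, max-age=" + MAX_AGE_SECONDS;
        }
        // Everything else: the browser must revalidate before reusing a copy.
        return "no-cache";
    }
}
```

In a servlet-based application, a filter could apply the returned value with `response.setHeader("Cache-Control", value)`. With long lifetimes, changed files generally need a new name or version in the URL so that browsers fetch the new copy.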

Figure 3: Analyzing real end user page load details revealed very long load time for jQuery

Final Result: Huge Improvement of End User Experience

With just the changes that the application team has already put in place, they have reduced the overall response time for their web pages by an average of 538.53ms (29%). This has also lowered the peak load time of the week by 894.51ms (27%). Figure 4 below shows the change in user response time for an average work week before (dashed line) and after (solid line) the changes were implemented. The PaCMan team and the application developers will continue looking into other statistics revealed via UEM in order to further fine-tune the application.
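
The absolute and percentage figures above can be cross-checked with a little arithmetic: if a saving of `s` milliseconds is `p` percent of the original value, the original was `s / (p / 100)` and the new value is that minus `s`. The helper below is a hypothetical sketch of that calculation; because the quoted percentages are rounded, its results are approximations rather than measured values.

```java
// Hypothetical helper for sanity-checking "X ms (Y%)" style figures.
class ReductionMath {
    // Baseline implied by an absolute saving and the percentage of the
    // baseline that the saving represents.
    static double impliedBefore(double savedMs, double percent) {
        return savedMs / (percent / 100.0);
    }

    // Value after the change: implied baseline minus the saving.
    static double impliedAfter(double savedMs, double percent) {
        return impliedBefore(savedMs, percent) - savedMs;
    }
}
```

For example, `impliedBefore(538.53, 29)` gives roughly 1,857 ms for the average page response time before the changes, and `impliedAfter(538.53, 29)` roughly 1,318 ms after them.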

Figure 4: Changes improved Page Load Time by 29%

A special Thank You to the two authors

Derek Abing and Brian Perrault are System Engineers with a leading insurance company focusing on application performance. They are helping transform dynaTrace into a greater APM solution and a greater Enterprise Monitoring solution. They also drive the creation and enhancement of various Action and Monitoring plugins, which help their organization collect, analyze, predict, and report the performance of key applications and infrastructure. Additionally, they are an integral part of shifting the organization’s paradigm from reactive remediation toward proactive management. Derek and Brian are active members of the APM Community and have been named Dynatrace’s Most Valuable Community Contributors (dTMVCC) for two years in a row. They have also obtained Dynatrace’s APM Associate Certifications.

*Feature image attribution Brain Dump