Why Web 2.0 requires End-To-End Performance Analysis and How to Do It

Web site performance is impacted by many factors, and it in turn impacts your business. When a user interacts with a web site and it feels slow, the cause can be slowly executing JavaScript, massive DOM manipulations, a slow network connection, high latency, an overloaded web server, slow-running server-side code or inefficient database queries. Web applications have changed over the years, and so has the performance anatomy of Web 1.0 vs. Web 2.0 applications. In our two recent webinars with Zappos and Monster, the performance engineers of these two clients talked about the challenge of Web 2.0 Performance Management and how they use Dynatrace's End-To-End Tracing capability to improve their web sites' end-user experience.

Performance Anatomy of Web 1.0 vs. Web 2.0 Applications

In Web 1.0 all Application Logic happened on the server. The generated web page was transferred via the network to the browser which simply rendered the content. Performance Management was focused on the following items: performance of application code that generated the returned content, generated size of content, caching strategy and network performance.

Web 1.0 Performance Anatomy

When we look at a modern Web 2.0 application we see many more moving parts than in a traditional Web 1.0 application. With Web 2.0 the application logic expanded from server-only to server-and-browser. JavaScript frameworks and more powerful browsers allow developers to leverage the browser as an application platform. Performance management therefore needs to start at the browser and span the full application stack: from the JavaScript & rendering engine of the browser, over the network (including asynchronously loaded content via AJAX/XHR), back to the application server and database.

Web 2.0 Performance Anatomy

Problem Patterns in Web 2.0 Applications

The problem patterns in Web 2.0 applications are similar to those in Web 1.0 applications, but they have a different impact on overall performance because the application infrastructure now includes the browser and relies more on the network connection for asynchronous message exchange. Let's have a look at three problem areas and how they impact performance:

Interactive User Interface

Interactivity is one of the big benefits of Web 2.0 Applications. It is accomplished by modifying the DOM (Document Object Model) via JavaScript, by making use of CSS (Cascading Style Sheets) and by loading partial content on-demand via asynchronous message exchange.

Frameworks like jQuery, Prototype, Google Web Toolkit, YUI and many others make it easy for developers to add interactivity to their web sites. Incorrect usage of these frameworks, however, leads to long-running JavaScript and expensive DOM manipulations that negatively impact the experienced end-user performance. Check out the following blog entries about the performance impact of jQuery and Prototype. They show how important it is for developers to understand the internals of a framework in order to prevent performance problems.
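A classic example of such a framework misuse is touching the DOM once per item inside a loop. The following sketch uses a small stand-in object (`list`) that counts how often the "DOM" is written to, so it runs outside a browser; in a real page each write can trigger style recalculation and layout work, which is why batching writes (e.g. via a DocumentFragment or a single `.append()` call) matters:

```javascript
// Mock element standing in for a real <ul>; it counts DOM writes.
const list = {
  html: '',
  writes: 0,
  append(markup) { this.html += markup; this.writes++; },
};

const items = Array.from({ length: 1000 }, (_, i) => `item ${i}`);

// Slow pattern: one DOM write per item, so up to 1000 layout invalidations.
function appendOneByOne(target, entries) {
  for (const entry of entries) target.append(`<li>${entry}</li>`);
}

// Fast pattern: build the markup in memory, write to the DOM once.
function appendBatched(target, entries) {
  target.append(entries.map((entry) => `<li>${entry}</li>`).join(''));
}

appendBatched(list, items);
console.log(list.writes); // 1
```

Both variants produce the same markup; the difference is invisible in the code's result but shows up directly as JavaScript and rendering time in a browser trace.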

Chatty Components

A well-known problem on the server side is components that are very "chatty". Components that exchange many messages for simple transactions lead to performance problems caused by the overhead of message serialization/deserialization and message transmission. The ability to send asynchronous messages from the browser to the server to request additional content or to update the server with status messages is a great thing. What needs to be considered is that these messages have to be serialized/deserialized on both sides (browser and server) and transmitted over a rather slow network connection (slow compared to LAN connections).

JavaScript frameworks make it easy to implement asynchronous message exchange. Frameworks that run on both the client and the server side often exchange messages without the developer knowing about it, e.g. to synchronize the state of server- and client-side objects. It is therefore important for developers to understand the actual message exchange. Chatty components not only suffer from slow internet connections, they also cause additional overhead on the server, leading to a bad end-user performance experience.
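One common cure for chattiness is to coalesce many small status updates into one request. This is a minimal sketch of that idea; `sendToServer` is a hypothetical transport callback (in a real page you would pass a function wrapping `XMLHttpRequest` or `fetch()`):

```javascript
// Queue status updates and send them as one serialized batch instead of
// one round trip per message.
function createBatcher(sendToServer) {
  const queue = [];
  let scheduled = false;
  function flush() {
    if (queue.length === 0) return;
    // One serialization and one network round trip for the whole batch.
    sendToServer(JSON.stringify(queue.splice(0)));
    scheduled = false;
  }
  return {
    update(message) {
      queue.push(message);
      if (!scheduled) {
        scheduled = true;
        setTimeout(flush, 100); // coalesce updates made within 100 ms
      }
    },
    flush, // allow an explicit flush, e.g. before the page unloads
  };
}
```

Ten calls to `update()` now cost one serialization and one request instead of ten, which reduces both the browser-side overhead and the load on the server.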

For an example of how to analyze chatty components, read the End-To-End Performance Analysis section below.

Problematic 3rd Party Libraries

I already mentioned the JavaScript frameworks that are used for interactive user interfaces. Their main problem is expensive DOM manipulation, which leads to high CPU overhead in the browser and impacts the end user's performance experience of the web site.

There are probably as many JavaScript libraries as there are stars in the sky. Most of them are free to download and therefore very attractive to include in web sites, as they solve many common developer problems, e.g. creating a fancy dynamic popup menu. Before using a library, however, a developer must get familiar with its internals. These libraries are often implemented to solve a certain use case for a particular web site. Even if it looks like a library also solves your particular problem, you need to test it in your specific environment. The following blog post, for instance, shows how a dynamic menu library works great on small menus but how performance suffers once the menu grows: Performance Analysis of JavaScript Menus
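One reason menu libraries degrade as menus grow is per-item setup work: wiring a click handler to every entry scales linearly with menu size. Event delegation keeps setup cost constant by registering a single handler on the menu root and inspecting the clicked node. The mock event below stands in for a real DOM click so this sketch runs outside a browser; in a page you would use `addEventListener` on the root and `event.target.closest('li[data-action]')`:

```javascript
// One delegated handler serves a menu of any size.
function makeDelegatedMenuHandler(onSelect) {
  return function handleClick(event) {
    const entry = event.target; // browser: event.target.closest('li[data-action]')
    if (entry && entry.action) onSelect(entry.action);
  };
}

// Usage with a mock event standing in for a real click:
let selected = null;
const handleClick = makeDelegatedMenuHandler((action) => { selected = action; });
handleClick({ target: { action: 'open-settings' } });
console.log(selected); // "open-settings"
```

Whether a given library uses a pattern like this is exactly the kind of internal detail worth checking before adopting it for a large menu.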

End-To-End Performance Analysis with Dynatrace AJAX and Dynatrace Server Editions

I have chosen Magnolia, an open source content management system, to demonstrate how to do End-To-End Performance Analysis starting from the browser all the way back to the database. I use the FREE Dynatrace AJAX Edition to analyze the browser activities, in combination with the Dynatrace Test Center Edition for server-side performance analysis.

Step 1: Setting up Dynatrace APM

I start by creating a System Profile for Magnolia including all standard Sensor Packs that Dynatrace provides for Java-based applications. This includes Java Servlets, JDBC, Java Web Services, Exceptions, JMS, EJBs, RMI, Hibernate and many more. In addition to this out-of-the-box configuration, I add my own custom sensors for the individual Magnolia packages using the Dynatrace Sensor Assistant to come up with a good set of instrumentation. I inject the Dynatrace Agent into the Magnolia startup script and am ready to go.

Step 2: Executing the Use Case to be analyzed with Dynatrace AJAX

Using Dynatrace AJAX Edition, I browse through the use case scenario I want to analyze. Dynatrace AJAX Edition opens an instance of Internet Explorer and captures all the activities that happen in the browser: downloads of network resources from the server, JavaScript executions, DOM manipulations and rendering activity. As Dynatrace APM is running as well, every interaction with the web site automatically captures a server-side PurePath, which includes the complete execution path of the server-side code, from the Servlet handler through custom code back to the database. I enable Session Recording so that all captured PurePaths will be available in a Dynatrace Session for later offline analysis.

The actual use case I tested was logging into the administrative console of Magnolia, clicking through the dynamic trees and changing configuration values. After I am done, I close the browser, go back to Dynatrace AJAX Edition and start with the Summary View.

Performance Overview of Browser Activities

Step 3: Analyzing the Browser Activities

The overview shows 3.7 seconds of JavaScript execution time, 1.7 seconds of rendering activity and 6.5 seconds spent in network downloads. It also shows several XHR requests to the server, visualized by the two arrows in the event column. This already tells me that the application is somewhat "chatty" with the server. One particular request took almost 2 seconds to return from the server, preceded by JavaScript activity that lasted for more than 1 second. I drill into the Timeline view for the administrative page and turn on additional events like mouse and keyboard interactions. I also zoom into the time region that is of interest to me:

Browser Activity of Interest - including JavaScript and XHR Request

Hovering over the mouse click event icon and the XHR icon shows which mouse event actually caused the 1.1-second JavaScript execution (a click on <span>Delete</span>) and which XHR network request took 1.8s (…/dms.html).

A double-click on the mouse event icon opens the PurePath view, showing all 3 JavaScript event handlers for mouseup, click and mousedown:

Ajax PurePath showing JavaScript Traces and XHR Request to the Server

The confirm method call took most of the time of this JavaScript execution. It was basically the time I as the user waited until I clicked the Confirm button. After confirm was clicked we can see the JavaScript that executed the asynchronous request to the server. Clicking on the network request shows, in the bottom right, the actual content that was returned. In the tree view we also see that the request took 1.8s. This is the point where we want to analyze the server-side activity in order to get a full End-To-End picture of this JavaScript event handler.
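The time attributed to the confirm call is worth a side note: window.confirm() blocks the JavaScript thread, so user think-time shows up as JavaScript execution time in a trace like this. A minimal sketch of a non-blocking alternative, assuming a hypothetical showDialog UI helper that renders a custom dialog and invokes a callback with the user's answer:

```javascript
// Wrap a callback-style dialog in a Promise so the handler can await it
// without blocking the JavaScript thread.
function asyncConfirm(message, showDialog) {
  return new Promise((resolve) => showDialog(message, resolve));
}

// The delete request is only sent after the user confirms, but no time is
// spent blocked inside an event handler while waiting for the answer.
async function onDeleteClicked(showDialog) {
  const confirmed = await asyncConfirm('Really delete this item?', showDialog);
  if (!confirmed) return 'cancelled';
  // ...send the asynchronous delete request (XHR/fetch) to the server here...
  return 'deleted';
}
```

With this pattern the trace would show two short JavaScript executions (open dialog, handle answer) instead of one long, mostly idle one.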

Step 4: Analyzing the Server-Side Activity

Once Dynatrace APM runs on the server that is analyzed with Dynatrace AJAX Edition, you can drill down from a network request in Dynatrace AJAX Edition to the server-side PurePath that Dynatrace APM captured for that particular request. This can be done via the context menu in the PurePath tree, as shown in the previous screenshot, or via the context menu in the Dynatrace AJAX Edition Network View:

Network View showing a PurePath icon where we have a Server-side PurePath

The actual drill-down opens the server-side PurePath in question in the Dynatrace Client and allows us to analyze the request and why it took 1.8s to respond to the JavaScript/XHR request. The first high-level view shows some key characteristics of the server-side transaction, including the PurePath Hot Spots visualization that highlights the hotspot methods in this particular transaction:

PurePath Overview and Hotspot

We can see that the PurePath took 1.874 seconds on the server. The next really interesting observation is the 1.4 seconds of wait time, meaning that the application had to wait on an object for roughly 80% of the total execution time. The Hot Spots visualization shows which methods contributed most to the PurePath. Hovering over the big block reveals that it is the doDeactivate method. Clicking on that block automatically brings us to the most problematic method in the PurePath:

Server-side PurePath of the problematic XHR Request

The server-side PurePath shows the full execution trace of the XHR request that was sent by the JavaScript code. The most problematic method, doDeactivate, is highlighted in red, showing a total execution time of 1.5 seconds, of which 1.48 seconds was wait time and 0.16 seconds was caused by a runtime suspension (garbage collection).

In addition to the method execution times we get context information like method arguments (e.g. "delete" for the execute method), all thrown exceptions including the exception message and full stack trace, SQL statements with their bind values, and full details on the servlet request:

Servlet Details including parameter information

Going back to the problematic method we identified: Dynatrace even allows us to look up the source code of this method that spends 1.48 seconds in wait. If we have the source code available, we can view it in Eclipse using the Dynatrace Eclipse Plugin. If we do not have the code available, we can choose to see a decompiled version of it:

Lookup Source Code of problematic Method

This code-lookup feature works for both Java and .NET, where Dynatrace also integrates with .NET Reflector to show decompiled code.

Conclusion: Why we need the End-To-End View

Modern web applications include the browser as part of the application platform. Performance management of those applications therefore needs to start at the browser and go all the way back to the server and database. The more frameworks (server- and browser-side) you add to the mix, the more complex the application becomes, and the more critical it is to understand what is going on end-to-end for every user interaction in order to tune web site performance. In the end it is about giving your end users the best possible experience on your website in order to keep them as users. The FREE Dynatrace AJAX Edition for browser performance analysis, together with its integration with Dynatrace APM, offers full End-To-End tracing and enables performance management for the new challenges of Web 2.0 applications.

As noted in the opening paragraph, check out our recent webinars with Zappos and Monster to see how they use Dynatrace AJAX Edition and Dynatrace APM to manage the performance of their web sites.

Andreas Grabner has 20+ years of experience as a software developer, tester and architect and is an advocate for high-performing cloud scale applications. He is a regular contributor to the DevOps community, a frequent speaker at technology conferences and regularly publishes articles on blog.dynatrace.com. You can follow him on Twitter: @grabnerandi