Hands-On Guide: Verifying FIFA World Cup Web Site against Performance Best Practices

Whether you call it Football, Futbol, Fussball, Futebol, Calcio or Soccer – if you are a fan of this game I am sure you are looking forward to the upcoming FIFA World Cup in South Africa. The tournament’s web site is http://www.fifa.com/worldcup and allows fans to follow their teams and get the latest updates on scores, standings, schedule, ticketing or hospitality. Only the best performing teams in the qualification matches made it to the tournament and only the best performing team will end this tournament as the new world champion.

As I’ve done with other sports events such as the Winter Olympics in Vancouver or the Golf Masters, I want to take you through a step-by-step analysis of different pages on the FIFA site based on Web Performance Best Practices that have been established over the last couple of years – such as the ones from Google and Yahoo. Free tools such as the Dynatrace AJAX Edition, Yahoo’s YSlow and Google’s PageSpeed make it easy to perform these analysis steps and identify issues that could easily turn into real performance problems once the web site is hit by many users.

My analysis of the FIFA site shows that – once the World Cup starts next week and the site gets hit by millions of users around the globe – there is a big chance the site will run into performance and scalability issues because it does not follow several of these best practices. As a result, the initial page takes more than 8 seconds to load and requires downloading more than 200 elements. These problems can easily be fixed by following the recommendations I highlight in this blog. Let’s get started:

What needs to be analyzed to identify a slow page?

Over the last months I developed my own approach when looking at a web site. First of all I look at Key Performance Indicators (KPI) such as Time to First Impression, Time to onLoad, Time till page is Fully Loaded, Number of Requests and Time Spent in JavaScript.

These KPI’s allow me to quickly identify whether we have some serious problems on a page or not. After that it is time to look into 4 specific areas where improvements can be made: Usage of Browser Caching, Network Resources and Transfer, Server-Side Processing Time and JavaScript Execution.

Key Performance Indicators (KPI’s)

There are many interesting metrics one can read when analyzing a web page. Feel free to use any of the tools I mentioned above or use other tools such as Fiddler, HTTP Watch, … (I am sure there are plenty out there that have different approaches to get to the same data). It is important to do the analysis across multiple browsers as different browsers have different performance characteristics. I am going to show the steps for analysis with Dynatrace AJAX Edition as it works on IE 6, 7 and 8 and it also allows me to do deep JavaScript performance analysis on top of network and traffic analysis.

Let’s start by analyzing the start page of the FIFA World Cup Web Site. Before I capture the performance information I make sure to clear the Browser Cache to experience the page as a first time visitor (dynaTrace AJAX provides an option to clear the cache for you – configurable in the Run Configuration Setting). The following image shows the Summary View of the initial page:

First Key Performance Indicators (# Roundtrips, Time in JavaScript, Time in Rendering)

On this Summary View we can see how much time was spent in total in JavaScript, Network and Rendering. It also shows us how many Network Roundtrips we had, with a detailed breakdown by individual MIME type. Let’s now look into the Timeline View. Based on a previous blog post where I explained how to measure KPI’s such as “Time to First Impression”, a look at the Timeline allows us to get these additional KPI’s:

Next KPIs: Time to First Drawing, Time to onLoad and Full Page Load

Now – let’s have a look at the Network View to check the detailed download graph. This view shows us which domains serve which resources and how the browser actually downloads the individual resources. We can spot redirect requests (HTTP 3xx), authentication issues (HTTP 4xx) and Errors (HTTP 5xx). The DNS and Connect Time tell us whether we have to deal with some expensive domains in terms of establishing a physical connection. The Wait Time tells us whether individual resources have to wait a long time before they are actually downloaded because of the limited number of physical network connections. The Server Time tells us whether the server takes a long time to respond to a request – indicating a server-side processing problem. Finally the Size and Transfer Time tell us whether we have a problem with large content and latency:

KPI's from Network View: # of Redirects, Size of Resources, Impact of Wait Time and Slow Server-Requests

Here is a summary of all KPI’s that we can read from the previous three views – and let me explain what they mean to me and what values I consider to be good or acceptable or not acceptable:

  • Time to First Impression/Drawing: 3.74s
    • Analysis: so it takes almost 4s until the user sees a visual indication of the page load – that is definitely too long and should be improved
    • Recommendation: < 1s is great. <2.5s is acceptable
  • Time to onLoad: 8.25s
    • Analysis: it takes the browser 8.25s to download the initial document plus all referenced objects before it triggers the onLoad event that allows JavaScript to modify the page after it has been loaded – again – much too slow as nobody likes to wait 8s until the content is loaded
    • Recommendation: < 2s is great. <4s is acceptable
  • Time to Fully Loaded: 8.6s
    • Analysis: the page loads additional resources triggered by JavaScript onLoad handlers. I consider the page as fully loaded when all these additional requests are downloaded. I guess I don’t need to mention that 8.6s is not fast 🙁
    • Recommendations: < 2s is great. <5s is acceptable
  • Number of HTTP Requests: 201
    • Analysis: 201 – that’s a lot of elements for a single page. Images are the main contributor to this load. My first thought on this -> let’s see how we can reduce this number, e.g. by merging files (more details later)
    • Recommendations: < 20 is great. < 100 is acceptable (This one is a hard recommendation as it really depends on the type of website – but – it is a good start to measure this KPI)
  • Number and Impact of HTTP Redirects: 1/1.44s
    • Analysis: This is a very expensive and seemingly unnecessary redirect from http://www.fifa.com/worldcup to http://www.fifa.com/worldcup/
    • Recommendations: 0. Avoid Redirects whenever possible
  • Number and Impact of HTTP 400’s: 1/0.71s
    • Analysis: There seems to be a JavaScript file that results in an HTTP 403 Forbidden Response and takes a total of 0.71s.
    • Recommendations: 0. Avoid any 400’s and 500’s
  • Size of JavaScript/CSS/Images: ~370kb/220kb/890kb
    • Analysis: Size of individual mime types is always a good indicator and helps to compare to other sites and other builds. 370kb of JavaScript and 220kb of CSS can probably be reduced to a smaller size by using certain minimization techniques or by getting rid of unused code or styles
    • Recommendations: It is hard to give a definite threshold value. Keep in mind that these files need to be downloaded and parsed by the browser. The more content there is the more work on the browser. The goal must be to remove all information that is not needed for the current page. I often see developers packing everything in a huge global .js file. That might be a good practice but too often only a fraction of this code is actually used by the end-user. It is better to load what needs to be loaded in the beginning and delay load additional content when really needed
  • Max/Average Wait Time: 4.31s/1.9s
    • Analysis: this means that resources have to wait up to 4.3s to be downloaded and that they have to wait 1.9s on average. This is way too much and can be reduced by either reducing the number of resources or by spreading them across multiple domains (Domain Sharding) in order to allow the browser to use more physical connections.
    • Recommendations: < 20ms is good. < 50ms is acceptable (as you can see – we are FAR OFF these numbers in this example)
  • Single Resource Domains: 1
    • Analysis: from the timeline we can also see that there is one domain that only serves a single resource. In this particular case it seems to be serving an ad. We can assume that this might not be changeable but this KPI is a good indicator on whether it is worth paying the cost of a DNS Lookup and Connect if we only download a single resource from a domain
    • Recommendations: 0. Try to avoid single resource domains. It is not always possible – but do it if you can
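
The thresholds from this list can be turned into a small checklist that is easy to rerun against every new build. The following Python sketch is only an illustration: the names `KPIS`, `rate_kpi` and `rate_all` are my own and not part of any tool, while the measured values and thresholds are the ones listed above.

```python
# (measured, great_below, acceptable_below) taken from the KPI list above.
# Names are illustrative; for every KPI here a lower value is better.
KPIS = {
    "time_to_first_impression_s": (3.74, 1.0, 2.5),
    "time_to_onload_s": (8.25, 2.0, 4.0),
    "time_to_fully_loaded_s": (8.6, 2.0, 5.0),
    "http_requests": (201, 20, 100),
    "avg_wait_time_s": (1.9, 0.020, 0.050),
}


def rate_kpi(measured, great_below, acceptable_below):
    """Rate a lower-is-better KPI against the two thresholds."""
    if measured < great_below:
        return "great"
    if measured < acceptable_below:
        return "acceptable"
    return "not acceptable"


def rate_all(kpis):
    """Rate every KPI in the dictionary and return name -> rating."""
    return {name: rate_kpi(*values) for name, values in kpis.items()}


if __name__ == "__main__":
    for name, rating in rate_all(KPIS).items():
        print(f"{name}: {rating}")
```

With the numbers measured on the start page, every KPI in this dictionary ends up in the "not acceptable" bucket.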

The KPI’s tell me that the page is way too slow – especially the Full Page Load Time of 8.6s needs to be optimized. With the KPI’s we can already think about certain areas to focus on, e.g.: reducing the network roundtrips or minimizing content size. But there is much more. Let’s have a closer look into 4 different areas.

Usage of Browser Caching

Browsers can cache content such as images, javascript or css files. Caching elements greatly improves browsing behavior for revisiting users as they do not need to download the same resources again. In order for that to work the browser needs to be told which elements to cache and which not to cache. Very often these Cache-Control settings are not correctly set – they are either simply forgotten or just done wrong. Please refer to the Browser Caching sections in the two Best Practice Documents of Google and Yahoo and learn how to use Expires Headers and Cache-Control Settings correctly. In order to analyze cache settings I go ahead and record another session. This time I DO NOT clear the browser cache as I want to see which objects were retrieved from Cache and which were not. The Summary View now tells me how many objects were taken from the Cache and which were not:

Resource Chart shows how many objects were retrieved from the Cache - it SEEMS to be a lot

It seems that cache settings were specified because the browser retrieved most of the images, javascript and css files from the Cache. A double click on the bar that represents the cached Images opens the Network View and lists all these images. And here we have something interesting to observe:

Short Expires Headers cause Browser to do a roundtrip to the server (IF-MODIFIED-SINCE Request)

Even though the objects are taken from the Cache (as indicated in the Cached column) – the browser has to send a request to the web server to check if the cached object is still valid. Why is that? Because the Expires Header only sets a date/time that is roughly 30s in the future. So – a returning user has to send the same number of HTTP Requests to the server asking whether the content is still valid (IF-MODIFIED-SINCE). Even though these requests only return that the content is still valid we end up having large wait times due to the fact that there are so many resources served by the same domain.

Besides resources with very short Expires Headers the page also contains a few resources whose Expires Header is set in the past. This might be on purpose to prevent any caching of these resources – but it also often happens due to misconfiguration of the web server.

Summarizing the Browser Caching Analysis – we have

  • 175 resources that have an Expires Header no longer than 48 hours in the future. I took the 48 hours from the Best Practices of Yahoo and Google
    • Solution: Analyze these resources and set Far-Future Expires headers where it makes sense, e.g.: all the flags of the participating countries
    • Save Potential: 175 unnecessary roundtrips to the server, lots of network time and transfer size
  • 4 resources that expired in the past
    • Solution: Look into those objects and verify if they really shouldn’t be cached at all
    • Save Potential: 4 unnecessary roundtrips to the server, lots of network time and transfer size
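
To make the three groups concrete, here is a minimal Python sketch that sorts a response into "expired", "short" or "far-future" by comparing its Expires header with its Date header. The function name `classify_expires` and the sample header values are assumptions made up for illustration; only the 48-hour threshold comes from the summary above.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Threshold used in the summary above (48 hours).
SHORT_LIVED = timedelta(hours=48)


def classify_expires(date_header, expires_header):
    """Classify a response by how far Expires lies after the Date header."""
    served = parsedate_to_datetime(date_header)
    expires = parsedate_to_datetime(expires_header)
    lifetime = expires - served
    if lifetime <= timedelta(0):
        return "expired"      # Expires is in the past: never served from cache
    if lifetime <= SHORT_LIVED:
        return "short"        # revalidated almost every time the user returns
    return "far-future"       # can be reused without a roundtrip


if __name__ == "__main__":
    served = "Fri, 04 Jun 2010 10:00:00 GMT"  # hypothetical Date header
    print(classify_expires(served, "Fri, 04 Jun 2010 10:00:30 GMT"))
    print(classify_expires(served, "Thu, 01 Jan 2009 00:00:00 GMT"))
    print(classify_expires(served, "Sat, 04 Jun 2011 10:00:00 GMT"))
```

The sketch only looks at Expires. A real check would also have to consider Cache-Control settings such as max-age, which take precedence when both are present.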

Network Resources and Transfers

This area of analysis focuses on requests that are unnecessary for all users (not just for revisiting users, where wrong cache settings are the cause). The first types of requests that should be avoided are HTTP Redirects (300’s), Authentication Issues (400’s) and Server-Errors (500’s). The 403 I’ve identified in the KPI section seems to occur not only on the initial page but on almost every page. This can be a result of a misconfiguration on the web server or a problem in the generated HTML code that includes this JavaScript file. The initial redirect (HTTP 3xx) also seems to be avoidable:

Avoiding Redirect and 403 saves more than 2 seconds

Next item on the list are resources that can be merged into fewer resources. Merging CSS or JavaScript files into fewer files reduces roundtrips. For images you can use a technique called CSS Sprites. A perfect example on this page is the country flags. These 68 individual flag images can be merged into a single image, reducing the number of HTTP Requests by 67.

While analyzing the flags I discovered an interesting “flaw” of the website. Maybe you already noticed it too. Why do we have 68 flags? There are only 32 countries playing in the tournament. Well – maybe they have different sizes of flags – that was my first assumption – and yeah – that is part of the discovery – but – it is not the real “flaw” I identified. The real reason is that the small flag images are hosted on two domains (img.fifa.com and www.fifa.com). The initial HTML page references the images with an absolute path from the img.fifa.com domain as well as with a relative path from www.fifa.com. The following illustration shows parts of the HTML Document from www.fifa.com that uses two different ways of referencing those flags. The illustration also shows the actual requests that are sent to the web server – it is easy to spot that the SAME country flags are downloaded twice, once from each domain:

Same image flag is downloaded from two different domains causing 32 unnecessary roundtrips

Fixing this problem saves 32 roundtrips as the browser can just use the already downloaded images – or – if you follow the best practices on merging the images into a single image using CSS Sprites we end up downloading only 1 image instead of 64 (nice save – isn’t it?). In case you wonder why there are 68 flag requests in total: some flags – depending on which page you are on – are also downloaded in medium and large size – that’s why I had 4 additional flags that got downloaded.
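
Duplicates like these are easy to miss by eye but simple to find in an exported list of request URLs. The following Python sketch groups URLs by path and reports every path that was requested from more than one host. The function name `find_cross_domain_duplicates` and the flag paths in the example are made up for illustration; only the two host names are taken from the analysis above.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_cross_domain_duplicates(urls):
    """Return {path: [hosts]} for every path requested from more than one host."""
    hosts_by_path = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        hosts_by_path[parts.path].add(parts.netloc)
    return {
        path: sorted(hosts)
        for path, hosts in hosts_by_path.items()
        if len(hosts) > 1
    }


if __name__ == "__main__":
    # Hypothetical request list; the paths are placeholders.
    requests = [
        "http://img.fifa.com/flags/s/bra.gif",
        "http://www.fifa.com/flags/s/bra.gif",
        "http://img.fifa.com/flags/s/ger.gif",
    ]
    for path, hosts in find_cross_domain_duplicates(requests).items():
        print(path, "->", ", ".join(hosts))
```

An identical path on two hosts is only a hint, not proof, that the content is the same, so each hit still deserves a quick look.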

Summarizing the Network Resources and Transfers – we have

  • One Redirect and one HTTP 403
    • Solution: figure out a way to get rid of them – especially the 403
    • Potential Savings: more than 2s of total network time + speeding up the initial download of the page by 1.4s when we get rid of the redirect. This will bring down the Time to First Impression, Time to onLoad and Time to Fully Loaded KPI’s
  • 32 duplicated downloads of flag images
    • Solution: change the src location of these flag images to be only taken from the img.fifa.com domain
    • Potential Savings: getting rid of 32 requests. By downloading them from img.fifa.com we also free up physical network connections on www.fifa.com to download things like javascript, css or swf files
  • 32 flag images for CSS Sprite use
    • Solution: Merge the 32 flags into a single image and use CSS Sprites
    • Potential Savings: reducing 32 requests to 1 -> saves 31 requests
  • ~100 additional images potential candidates for CSS Sprite
    • There are a total of 175 images on that page. So – besides the 68 flag images we have 100 more that are potential candidates for merging like the 19 sponsor logos or 10 organization logos
    • Potential Savings: we can probably get rid of another 50-70 requests on these images
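
For readers who have not used CSS Sprites before: all small images are placed into one grid-shaped image and each element shows its own tile via a background offset. The Python sketch below generates such rules. The function name `sprite_css`, the `.flag-` class prefix, the 20x13 pixel tile size and the file name `flags.png` are assumptions for illustration and are not taken from the FIFA site.

```python
def sprite_css(names, width, height, sprite_url, columns=8):
    """Return one CSS rule per image, positioned inside a grid-shaped sprite."""
    rules = []
    for index, name in enumerate(names):
        row, col = divmod(index, columns)
        # Negative offsets shift the sprite so that the wanted tile is visible.
        x, y = -col * width, -row * height
        rules.append(
            f".flag-{name} {{ background: url({sprite_url}) {x}px {y}px no-repeat; "
            f"width: {width}px; height: {height}px; }}"
        )
    return rules


if __name__ == "__main__":
    # Three placeholder team codes; a full sprite would list all 32 teams.
    for rule in sprite_css(["bra", "ger", "esp"], 20, 13, "flags.png"):
        print(rule)
```

The HTML then only needs an element with the matching class, and the browser downloads the one sprite image once for all flags.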

Application Server-Side Processing Time

Once we solve all the deployment issues like making correct use of browser caching and optimizing network resources we are ready for some real load. Unless you are serving static content only, increasing load usually has a negative impact on application server-side processing time. Why is that? Because that is when the application code actually needs to perform some work such as getting information from the database (who scored the goals in the opening match) or querying external services (e.g.: how many tickets are still available for a certain game). The more requests the server has to handle the more pressure it puts on the actual implementation of the code and this often reveals problems on the server-side. These are problems that really hurt your business in case critical transactions such as buying a ticket don’t finish fast enough or actually fail under heavy load.

That is why we have to look at requests that actually cause the application to do work. How to identify those requests? If you know the application I am sure you have a good understanding of which requests are served by the app server, your web server or your CDN. We can also check the HTTP Response Headers to see whether the application adds some app-specific headers. Here’s a quick way – I look at requests that show one of the following characteristics:

  • First request on the page -> usually returns the initial HTML
  • Requests that return HTML -> generated content (this also may include static HTML pages)
  • Requests on URL’s ending with aspx, jsp, php
  • Requests that send GET or POST parameter data to the server
  • All XHR/AJAX Requests
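
These criteria can be written down as a simple filter over a captured request list. The Python sketch below is one possible reading of the list above. The function name `looks_like_app_request`, its parameters and the example URLs are my own assumptions, and the heuristic will produce false positives, for example for static HTML pages.

```python
from urllib.parse import urlsplit

# URL endings that usually indicate generated content (from the list above).
DYNAMIC_EXTENSIONS = (".aspx", ".jsp", ".php")


def looks_like_app_request(index, method, url, content_type="", is_xhr=False):
    """Return True if a request matches at least one of the heuristics above."""
    parts = urlsplit(url)
    return (
        index == 0                                         # first request on the page
        or content_type.lower().startswith("text/html")    # returns HTML
        or parts.path.lower().endswith(DYNAMIC_EXTENSIONS)  # aspx, jsp, php
        or bool(parts.query)                               # GET parameters
        or method.upper() == "POST"                        # POST data
        or is_xhr                                          # XHR/AJAX request
    )


if __name__ == "__main__":
    # Hypothetical requests: (index, method, url, content type, is_xhr)
    captured = [
        (0, "GET", "http://www.fifa.com/worldcup/", "text/html", False),
        (1, "GET", "http://img.example.com/flags/bra.gif", "image/gif", False),
        (2, "GET", "http://www.example.com/search?q=tickets", "application/json", True),
    ]
    for request in captured:
        print(request[2], looks_like_app_request(*request))
```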

The following image shows the Network View with all those requests that meet my criteria showing me that a total of ~3.6s is spent in Server-Side Processing:

3.6 seconds in Application Server-Side Processing Time

The Server-Column shows the Time to First Byte. This is as close to server-side processing time as we can get by analyzing the network requests that are sent by the browser. So – this is the time from the last byte of the HTTP Request being sent until the first byte of the response is received. This also includes some network latency – but as I said, this is very close to the actual server-side processing time. When we want to get more accurate numbers we have to analyze the actual processing time on the application server itself. Either analyze server log files or use an APM Solution such as Dynatrace that allows us to get a full end-to-end view of each individual request.

Summarizing the Application Server-Side Processing Time – we have

  • 10 Requests that seem to hit an application server consuming a total of 3.6s on the server and return ~800kb of data
    • Solution: Analyze server-side processing and tweak performance by following best practices such as reducing roundtrips to the database, reducing remoting calls, and optimizing synchronization and memory usage. There are plenty of articles on this blog – I definitely recommend reading the 2010 Performance Almanac from Alois
    • Potential Savings: based on our experience with our clients performance can be increased 3-fold by following the server-side performance best practices. You have to have the appropriate tools for a detailed analysis and you should start with your performance optimization efforts early on in the project – don’t start in production 🙂 – Listen in to some Best Practices of our clients such as Zappos, Insight, Monster or SmithMicro

JavaScript Execution

I guess I don’t need to talk about the importance of JavaScript in modern web applications. As great and important as JavaScript and AJAX are you have to make sure to use it efficiently because JavaScript execution adds to the overall end-user experience. I already blogged about different best practices on things like correct usage of CSS Selectors with jQuery and Prototype and showed the performance impact of bad JavaScript code and incorrect jQuery usage by analyzing sites like vancouver2010.com or analyzing internals of JavaScript menus.

There are two areas I focus on when analyzing JavaScript executions: slow JavaScript handlers and usage of jQuery methods. I start by looking at the Timeline View and identify big blocks of JavaScript handler executions. A double-click on one of these blocks takes me to the PurePath view that shows me what has actually been executed. The following illustration shows this exercise on the Matches page:

2s spent in CSS Selector Lookup by Class Name. This method of lookup is always slow in IE

It seems that almost all of the time of this inline JavaScript is spent in the CSS Lookup. Please read the jQuery blog that explains why this method is slow in Internet Explorer. This is our first big problem on this page. Next step is to focus on jQuery calls in general. I open the HotSpot View and filter it to show methods that include “$(” and then sort it by the Total Sum column. I get a list of all jQuery Selector calls, how often they get called and how much time is spent in these calls. REMARK: In order to get the jQuery Selector Arguments you have to turn on Argument Capturing in the Dynatrace AJAX Edition:

Unnecessary jQuery Lookup Calls and many Lookup calls by getElementsByClassName

When double clicking on a method the HotSpot View shows us the Back Traces (the reversed call tree). Doing this on all these $(<xy>) showed me that these calls are calls made by the implementation of getElementsByClassName. By fixing that problem we also get rid of all these. More interesting on this page is the lookup that is highlighted in the screenshot above. The method currMenuItem executes an expensive lookup of the same element 4 times resulting in 87ms execution time. 3 of these 4 calls can be saved by caching the lookup result.

Summarizing the JavaScript Execution – we have

  • 1 lookup by class name taking 2s execution time
    • Solution: instead of looking elements up by class name alone use a lookup by id (#) or at least add a tag name to the class selector. Also – make sure to use the latest versions of your lookup framework such as jQuery – they constantly make performance improvements
    • Potential Savings: I would say we can save 99% of the execution time when switching to a lookup by ID -> that’s a great save 🙂
  • Redundant lookups
    • Solution: cache the lookup result of the first lookup call. Then reuse this value for additional operations
    • Potential Savings: In this case we can save 60ms
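
Why these two fixes help can be shown with a language-neutral toy model, written in Python here. It is not how jQuery or Internet Explorer work internally: `ToyDocument` and `build_document` are hypothetical names, and the model simply assumes that a lookup by class has to inspect every element while a lookup by id uses an index.

```python
class ToyDocument:
    """Toy stand-in for a DOM: a flat list of (element_id, class_name) pairs."""

    def __init__(self, elements):
        self.elements = list(elements)
        self.by_id = {element_id: (element_id, cls) for element_id, cls in self.elements}
        self.visited = 0  # number of elements touched by all lookups so far

    def find_by_class(self, class_name):
        # No index by class in this model: every element has to be inspected.
        matches = []
        for element in self.elements:
            self.visited += 1
            if element[1] == class_name:
                matches.append(element)
        return matches

    def find_by_id(self, element_id):
        # Direct index lookup: only one element is touched.
        self.visited += 1
        return self.by_id.get(element_id)


def build_document(size=1000):
    """Create a document in which exactly one element has the class 'current'."""
    return ToyDocument(
        (f"item{i}", "current" if i == size // 2 else "other") for i in range(size)
    )


if __name__ == "__main__":
    doc = build_document()
    for _ in range(4):                      # same lookup repeated four times
        doc.find_by_class("current")
    print("repeated class lookups:", doc.visited)

    doc = build_document()
    current = doc.find_by_class("current")  # looked up once, result reused
    print("cached class lookup:", doc.visited)

    doc = build_document()
    doc.find_by_id("item500")
    print("lookup by id:", doc.visited)
```

In this model, repeating the class lookup four times touches four times as many elements as doing it once and reusing the result, and the lookup by id touches a single element regardless of page size.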

Overall Performance Analysis Results – What’s my Rank?

If I had to rank this web site – similar to what YSlow and PageSpeed are doing – I would come up with the following result:

Overall Ranking: F

  • Browser Caching: F – 175 images have a short expires header, 4 have a header in the past
  • Network: F – 201 Requests in total, 1 Redirect, 1 HTTP 403, duplicated image requests on different domains
  • Server-Side: C – 10 App-Server Requests with a total of 3.6s -> analyze server-side processing
  • JavaScript: D – Use CSS Lookups by ID instead of Class Name

Good news is that there is lots of potential to speed up this web-site by following these Best Practices 🙂

Follow up readings …

Throughout this blog I linked to different blogs and websites that you should look into. Here are some MUST READS:

Additionally you should check out How to Get Started with Dynatrace AJAX Edition and the Webinar we had with Monster.com to better understand how Dynatrace AJAX Edition can help you analyze your website.

Andreas Grabner has 20+ years of experience as a software developer, tester and architect and is an advocate for high-performing cloud scale applications. He is a regular contributor to the DevOps community, a frequent speaker at technology conferences and regularly publishes articles on blog.dynatrace.com. You can follow him on Twitter: @grabnerandi