In my previous post I showed you why GoDaddy could sustain the peak load after its Super Bowl ad campaign aired, and why others like Kia wasted a lot of marketing money because their site availability dropped under the load. In this post, we will look at the technical details behind the impact of oversized pages, and how kia.com might have averted that outcome.
Kia.com availability eventually dropping to 0% – see previous post
Lesson #1: Bloated Pages will Kill your Web Servers
Previously, I noted that availability issues occurred for Kia during game time. The Kia team broke an important performance rule: keep the number of bytes transferred low.
First, let’s look at the anatomy of an American football fumble: play in motion, ball fumbled, pile-up ensues until possession is determined, then play is recovered or turned over.
Now, let’s look at the anatomy of a heavy web page that is otherwise delivered successfully under normal load. First, heavy traffic begins to arrive. A connection pile-up ensues as the campaign puts massive load on the front end. The front end reaches its tipping point when established connections cannot finish transferring their bytes faster than new connections are requested. Eventually, new connections time out after 60 seconds of inactivity, and the web site is barely in motion.
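The pile-up described above can be sketched with a toy model. This is only an illustration, not a measurement of kia.com: the function name `simulate_front_end`, the worker count, the per-connection transfer times, and the arrival rate are all hypothetical values chosen to show the shape of the problem. The only number taken from this post is the 60 second timeout.

```python
from collections import deque


def simulate_front_end(arrival_rate, workers, transfer_secs, duration, timeout=60):
    """Toy second-by-second model of a front end with a fixed worker pool.

    Each worker finishes one connection every `transfer_secs` seconds, so the
    pool completes at most workers / transfer_secs connections per second.
    Connections are served oldest first; anything that has waited `timeout`
    seconds or more is counted as timed out.
    """
    capacity = workers / transfer_secs      # connections completed per second
    waiting = deque()                       # [arrival_second, connections still waiting]
    served = 0.0
    timed_out = 0.0

    for now in range(duration):
        waiting.append([now, float(arrival_rate)])

        # Drop connections that have been waiting too long.
        while waiting and now - waiting[0][0] >= timeout:
            timed_out += waiting.popleft()[1]

        # Serve the oldest waiting connections with this second's capacity.
        budget = capacity
        while waiting and budget > 0:
            take = min(budget, waiting[0][1])
            waiting[0][1] -= take
            budget -= take
            served += take
            if waiting[0][1] <= 0:
                waiting.popleft()

    backlog = sum(count for _, count in waiting)
    return {"served": served, "timed_out": timed_out, "backlog": backlog}


# Hypothetical numbers: same traffic and worker pool, different page weight.
light_page = simulate_front_end(arrival_rate=50, workers=100, transfer_secs=1, duration=120)
heavy_page = simulate_front_end(arrival_rate=50, workers=100, transfer_secs=10, duration=120)
print("light page:", light_page)
print("heavy page:", heavy_page)
```

With the light page the pool keeps up and nothing queues. With the heavy page each connection holds a worker ten times longer, so the backlog grows every second until the oldest waiting connections start timing out.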
Kia’s page was roughly 11MB, and ranked 51 out of the 53 advertisers for heaviest byte count. Kia’s tools should have made this fact obvious, unless it was deliberate. Either way, soon after its 3rd quarter ad played, kia.com failed miserably.
The following screenshot shows the drop in transferred bytes during game time when hitting the home page. At the same time, we observed internal server errors and socket timeouts, which explain the actual problem: the web servers couldn’t handle the incoming load because they were busy delivering the very heavyweight content. Note: in my previous post I identified the symptom of socket timeouts; however, further investigation for this post revealed a more prevalent symptom, internal server errors (HTTP status code 500):
Socket Timeout Error:
Internal Server Error 500 – yes, this is a generic error message, but given what we know about kia.com’s site performance, the unexpected condition the servers encountered was likely due to excessive load and their inability to service that traffic:
Lesson #2: What Caused the Bloat?
One thing is certain with web and mobile pages, as with wide receivers: they must be lean to be fast, or it’s much harder to catch higher conversion rates. But why is a leaner page faster than a heavier one? Simple: it’s the physics of kilobits per second (kbps). The bigger the byte count, the longer the download time.
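As a rough back-of-the-envelope check, the ideal transfer time is the number of bits to move divided by the link speed in bits per second. The sketch below assumes decimal units (1MB = 1,000,000 bytes, 1 kbps = 1,000 bits per second) and a hypothetical 8,000 kbps connection. It ignores latency, TCP slow start, compression, and parallel connections, so real download times will differ. The helper name `download_seconds` is made up for this example.

```python
def download_seconds(page_megabytes, link_kbps):
    """Ideal transfer time in seconds: bits to move / bits per second."""
    bits_to_move = page_megabytes * 1_000_000 * 8
    bits_per_second = link_kbps * 1_000
    return bits_to_move / bits_per_second


# Hypothetical 8,000 kbps link; 10MB matches the video payload discussed in this post.
for size_mb in (1, 10):
    print(f"{size_mb}MB page: {download_seconds(size_mb, 8000):.1f} seconds")
```

On that assumed link a 1MB page needs about one second of pure transfer time, while 10MB of video needs about ten, and every one of those seconds keeps a server connection occupied.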
And why would you want to keep it slim? Consider the fact that building a leaner page is more challenging – it takes serious design considerations, collaboration, and engineering deployment decisions to deliver a page with streamlined efficiency and high customer impact. There is more to “lean” than just bytes – check out our Top Client Side Performance Landmines to learn more.
How was Kia’s page not-so-lean? The majority of the content was an average of 10MB of WebM video files, plus almost 1MB of images. That’s a lot of media content for one page. And since there are only so many threads a web server can service simultaneously, the longer an established connection lasts, the longer it prevents a new connection from being established.
WebM video files comprised over 90% of the kia.com page; the video object type (in dark green) is non-existent in the post-ad box area.
Interestingly, Kia partnered with a CDN vendor, relying on an accelerated delivery platform for its success. But even CDNs may have difficulty delivering the most bloated pages. This is why it’s important to weigh the trade-off between a rich, captivating user experience and the confidence of delivering it 100% of the time, with your partners, under all load conditions.
This raises the question – what is the most effective tool Kia could have used to better understand the impact of its bloated page structure? The answer is a real-world load testing project.
This is best accomplished by testing your application delivery chain: the real-world client > cloud > datacenter path your visitors cross when going to your live site. You do this by generating live, client-based load and hammering your site with 1.5-3x your expected highest per-minute peak traffic. Why? When you adopt load testing that starts from the end user’s point of view, you have a better chance of eliminating blind spots in the round trip to your infrastructure and application tiers, including your CDN partners.
Proper load testing is akin to using a crash-test dummy. Before you buy that new car, the manufacturer has conducted crash tests so it can state facts about the car’s safety rating with confidence. Consider approaching your web site in the same way by conducting a real-world load test to establish your own go-live rating with confidence.
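To turn the 1.5-3x guidance into concrete targets for a load testing tool, a small helper like the one below can be used. It is a sketch only: the function name `load_test_targets`, the 2.0 midpoint stage, and the 60,000 requests per minute peak are assumptions made for illustration, not figures from Kia or any other advertiser.

```python
def load_test_targets(expected_peak_per_min, multipliers=(1.5, 2.0, 3.0)):
    """Return one test stage per multiplier of the expected peak traffic."""
    stages = []
    for multiplier in multipliers:
        per_minute = expected_peak_per_min * multiplier
        stages.append({
            "multiplier": multiplier,
            "requests_per_min": per_minute,
            "requests_per_sec": per_minute / 60,
        })
    return stages


# Hypothetical expected peak of 60,000 page requests per minute.
for stage in load_test_targets(60_000):
    print(stage)
```

Each stage can then be configured as a separate step in whichever load testing tool you use, generated from the regions and browsers your real visitors come from.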
Takeaways for Developers
Takeaways for Performance Engineers
- Load test at 1.5x – 3x of your expected load. Make sure to test from different regions and browsers.
Takeaways for Business / Marketing
- Make sure the advertising dollars spent are matched by the money it will take to be sure that IT and Dev get it right.
In my next post I will help answer these questions: is deploying SSL on a homepage a security requirement, a marketing tactic, or simply a misunderstood technology? And why is it important not only to include your CDN partner in your testing and monitoring strategy, but also to analyze which CDN partner is the right one for you?