Outage Analyzer showcases the value of “Big Data” performance analytics: it delivers answers instead of just more data, with real-time visualizations and alerts of outages in third-party web services. Outage Analyzer harnesses the collective intelligence of the Compuware Gomez Network. With more than eight billion measurements a day across the global Internet, determining whether an application performance issue is the fault of an organization’s code or of a third-party service has never been easier.
The initial map view highlights the location of the datacenter where the failing third-party service is hosted and shows the geographies in which end-users are impacted. It also identifies the number of potentially affected domains and the system’s confidence that this is a serious outage.
Figure 1: Outage Captured by Outage Analyzer
Under the Outage Analyzer map view lies an extremely rich dataset and a Big Data performance analytics platform that provides information on key patterns of third-party content use. A constant stream of measurement data passes through the proprietary Outage Analyzer processing engine, which is designed to detect patterns and learn the interrelationships between the various cloud providers, CDNs, and other third-party services. This process has uncovered some very interesting trends.
Taking one day of data from the Outage Analyzer system and grouping the processed objects into a number of categories, we quickly discovered that websites today are heavily reliant on third-party services to provide Ad Serving and Web Analytics – with these two categories alone comprising 73% of the objects processed in a 24-hour period.
Figure 2: Third-party Content – By Type
While the presence of many of the categories was expected, the most surprising finding was the appearance of a class of service that falls under the category of Ad Verification. Although it makes up only 2% of the total objects, this type of third-party web service is one that we plan to watch for growth over the next few months.
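To make the category breakdown concrete, here is a minimal sketch of how one day of processed objects could be grouped by category and turned into percentage shares. It is not the Outage Analyzer engine: the record layout, the `category` field, the `category_shares` function, and the sample URLs are all assumptions made up for illustration.

```python
from collections import Counter


def category_shares(objects):
    """Return {category: percent of all objects}, largest category first.

    `objects` is a hypothetical list of dicts with a "category" key;
    the real Outage Analyzer data model is not described in this article.
    """
    counts = Counter(obj["category"] for obj in objects)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: 100.0 * n / total for cat, n in counts.most_common()}


# Made-up sample records, not real measurement data.
sample = [
    {"url": "https://ads.example.com/a.js", "category": "Ad Serving"},
    {"url": "https://ads.example.com/b.js", "category": "Ad Serving"},
    {"url": "https://stats.example.net/t.gif", "category": "Web Analytics"},
    {"url": "https://verify.example.org/v.js", "category": "Ad Verification"},
]

shares = category_shares(sample)
for category, percent in shares.items():
    print(f"{category}: {percent:.1f}%")
```

Run against a full day of categorized objects, a summary like this is what would surface both the dominant categories and small ones worth watching.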
Outage Analyzer also identified over 1,500 unique third-party web services in use, a clear indication of how important it is to understand which third-party web components provide content for sites and how these services affect overall performance and user experience.
The third area that Outage Analyzer exposed was the number of object failures that were encountered by the system in a single day. As can be seen below, these object failure numbers further highlight the need to completely understand where all the content on a page originates, and how its behavior can affect overall site performance.
Figure 3: Object Failures by Category
Between the two largest categories, Ad Serving and Web Analytics, over 59,000 object failures were recorded in a 24-hour period. Even if these failures did not affect the overall customer experience, they did lead to situations where ad impressions were lost and visitor analytics were not collected.
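A per-category failure tally like the one described above can be sketched as follows. This is an illustrative assumption about how such a count might be computed, not the product's implementation: the `ok` flag, the record layout, and the `failures_by_category` function are hypothetical, and the sample data is invented.

```python
from collections import defaultdict


def failures_by_category(measurements):
    """Count failed and total object measurements per category.

    Each measurement is a hypothetical dict with a "category" string and
    an "ok" boolean (False means the object failed to load).
    """
    totals = defaultdict(lambda: {"failed": 0, "total": 0})
    for m in measurements:
        bucket = totals[m["category"]]
        bucket["total"] += 1
        if not m["ok"]:
            bucket["failed"] += 1
    return dict(totals)


# Invented sample measurements, for illustration only.
day = [
    {"category": "Ad Serving", "ok": False},
    {"category": "Ad Serving", "ok": True},
    {"category": "Web Analytics", "ok": False},
    {"category": "Web Analytics", "ok": False},
    {"category": "CDN", "ok": True},
]

tally = failures_by_category(day)

# Combined failures for the two categories highlighted in the text.
combined = tally["Ad Serving"]["failed"] + tally["Web Analytics"]["failed"]
print(f"Ad Serving + Web Analytics failures: {combined}")
```

Keeping the total alongside the failure count makes it possible to report a failure rate per category as well as the raw number.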
With Outage Analyzer, we can now see third-party issues in their entirety, helping our customers determine, with increased certainty, whether the issue is only affecting their site or if it is regional, national, or even global in scope.
This first look at the deeper patterns shows how Outage Analyzer exposes critical information from within the reams of web performance data collected every day, cutting through the data noise. In further analyses, we’ll dig deeper into how outages spread and how an outage with one service can cascade to affect other services.