Detection and correlation of availability issues

Dynatrace intelligent problem correlation helps DevOps teams quickly identify and understand problems in large-scale environments. To highlight the value of Dynatrace availability-problem detection, we’ve added textual explanations to event descriptions on Problem pages.

The first addition relates to Unexpected low traffic application problems. Such problems often indicate availability issues on the application side. Because Dynatrace compares current traffic with traffic from 7 days earlier, it can immediately identify unexpected drops in traffic. If your application’s browser traffic drops to zero, you are most likely experiencing a global application outage. The description of such events now includes the explicit warning: Potential application outage, traffic dropped to zero! (see screenshot below). We’ve also changed the impact count so that the Impact on real users metric reflects the anticipated amount of traffic (that is, the number of users who visited the site 7 days earlier).

[Screenshot: availability issues]
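
The week-over-week comparison described above can be illustrated with a minimal Python sketch. This is not Dynatrace's implementation: the function name `detect_low_traffic`, the `TrafficEvent` type, and the `drop_ratio` threshold are hypothetical and chosen only for illustration. The parts taken from the text are the 7-day reference, the warning message for a drop to zero, and the impact count based on the anticipated traffic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrafficEvent:
    description: str
    impacted_users: int  # anticipated traffic, i.e. users seen 7 days earlier


def detect_low_traffic(
    current_users: int,
    users_7_days_ago: int,
    drop_ratio: float = 0.5,  # hypothetical threshold, not a Dynatrace setting
) -> Optional[TrafficEvent]:
    """Compare current traffic with the reference from 7 days earlier."""
    if users_7_days_ago <= 0:
        # No reference traffic, so there is nothing to compare against.
        return None
    if current_users == 0:
        return TrafficEvent(
            "Potential application outage, traffic dropped to zero!",
            users_7_days_ago,
        )
    if current_users < users_7_days_ago * drop_ratio:
        return TrafficEvent("Unexpected low traffic", users_7_days_ago)
    return None
```

With these assumptions, a call such as `detect_low_traffic(0, 120)` yields an event whose impact count is the 120 users expected from the week before, rather than the zero users currently observed.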

Monitoring real user traffic is a lightweight and fast approach to monitoring site availability, particularly if you don’t have any synthetic web checks set up. During low-traffic situations (or during an organization’s startup phase), Dynatrace can’t be 100% sure whether a “drop to zero traffic” indicates an application outage or simply an unexpected period of low traffic while your application is up and running. For this reason, it’s a good idea to also run synthetic web checks, which give you insight into such situations. Dynatrace problem correlation automatically merges these “traffic dropped to zero” events with related global outage events detected by synthetic monitoring. With this approach, Dynatrace can differentiate between “traffic dropped to zero” events that are real application outages (see example below) and periods when your site simply has no visitors.

[Screenshot: availability issues]
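
To make the merging idea concrete, here is a small Python sketch that pairs a “traffic dropped to zero” event with synthetic outage events. It assumes, purely for illustration, that two events are related when their time windows overlap; the actual Dynatrace correlation logic is not described in this post. The `Event` type, the event kind strings, and `classify_traffic_drop` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    kind: str   # e.g. "traffic_dropped_to_zero" or "synthetic_global_outage"
    start: int  # start of the event, in minutes on a shared clock
    end: int    # end of the event, in minutes on a shared clock


def overlaps(a: Event, b: Event) -> bool:
    """True if the two events share at least one point in time."""
    return a.start <= b.end and b.start <= a.end


def classify_traffic_drop(drop: Event, synthetic_events: Iterable[Event]) -> str:
    """Label a traffic drop, using synthetic outages as confirmation."""
    for event in synthetic_events:
        if event.kind == "synthetic_global_outage" and overlaps(drop, event):
            return "application outage"
    # No synthetic outage in the same window: the site may just have no visitors.
    return "possibly no visitors"
```

In this sketch, a drop that coincides with a synthetic global outage is labeled an application outage, while a drop with no matching synthetic event is treated as a period that may simply have no visitors.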

To improve the visibility of backend service availability, we will soon introduce a dedicated service availability pattern. Stay tuned for more service availability improvements.

Wolfgang is a Technical Product Manager at Dynatrace. He has a long record of research in software analytics and mobile computing. At Dynatrace, he's responsible for baseline calculation and event correlation in performance analytics. He also drives the topic of mobile app monitoring.