Problem detection and analysis

Dynatrace uses a sophisticated AI causation engine, called Davis®, to automatically detect performance anomalies in your applications, services, and infrastructure. Dynatrace-detected problems are used to report and alert on abnormal situations, such as performance degradations, improper functionality, or lack of availability. In other words, a problem represents an anomaly relative to baseline system performance. Problems have defined lifespans and are updated in real time with all incoming events and findings. Once a problem is detected, it's listed on your problems feed.

Events and problems

A problem may be the result of a single event or multiple events, which is often the case in complex environments. To prevent a flood of seemingly unrelated problem alerts for related events in such environments, the Dynatrace AI correlates all events that share the same root cause into a single, trackable problem. This approach prevents event and alert spamming.
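
The correlation idea above can be illustrated with a minimal sketch. This is not how Davis works internally; it only shows the general principle of grouping events under a shared root cause. The `Event` class, its fields, and the `correlate` function are hypothetical names introduced for illustration, and the sketch assumes each event already carries an identified root-cause entity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event record; field names are assumptions for this sketch.
    event_id: str
    event_type: str
    root_cause: str  # entity assumed to be identified as the root cause


def correlate(events):
    """Group events that share a root cause into one problem per root cause."""
    problems = defaultdict(list)
    for event in events:
        problems[event.root_cause].append(event)
    return dict(problems)


events = [
    Event("e1", "response time degradation", "db-host-1"),
    Event("e2", "failure rate increase", "db-host-1"),
    Event("e3", "process crash", "web-host-7"),
]

# Three events collapse into two problems, so only two alerts would be raised.
problems = correlate(events)
```

Grouping by root cause rather than by event type is what keeps related symptoms, such as a slowdown and a failure-rate increase caused by the same host, from producing separate alerts.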

Events represent different types of individual incidents, such as metric-threshold breaches, baseline degradations, or point-in-time incidents like process crashes. Dynatrace also detects and processes informational events such as new software deployments, configuration changes, and other event types.

Impact and root cause analysis

For each detected problem, Dynatrace investigates the problem's impact and root cause. With the aid of a visual resolution path (provided for all problems that affect multiple infrastructure components), you can even replay the sequence of detected events that led up to and are correlated with any given problem. Dynatrace offers two levels of impact analysis: direct impact analysis, which details a problem's impact on user experience, and business impact analysis, which focuses on identifying any effects that a problem may have on the success of your digital business (for example, if your website can no longer process new orders).

Impact and root cause analysis details are presented on each problem's dedicated overview page.

Raising and evaluating problems

Dynatrace continuously monitors the performance of every aspect of your applications, services, and infrastructure to automatically learn all baseline metrics and the overall health of each component in your environment, including the response times of your applications and services. Variables such as geolocation, browser type, operating system, connection bandwidth, and user actions are factored in automatically. This intelligent automated baselining allows Dynatrace to detect anomalies at a highly granular level and to notify you of detected problems in real time. You can customize the thresholds generated through automated baselining either by adapting the sensitivity of problem detection or, if necessary, by defining your own static thresholds.
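
To make the difference between a learned baseline, a sensitivity setting, and a static threshold concrete, here is a simplified sketch. It is an assumption-laden illustration, not the Dynatrace baselining algorithm, which is multi-dimensional and automated. The function `is_anomalous`, its parameters, and the mean-plus-standard-deviation rule are all hypothetical choices made for this example.

```python
from statistics import mean, stdev


def is_anomalous(history, value, sensitivity=3.0, static_threshold=None):
    """Flag a value as anomalous against a baseline or a static threshold.

    Assumed rule for this sketch: without a static threshold, a value is
    anomalous when it exceeds the historical mean by more than
    `sensitivity` sample standard deviations. A lower `sensitivity`
    number means detection triggers more easily.
    """
    if static_threshold is not None:
        # A user-defined static threshold replaces the learned baseline.
        return value > static_threshold
    baseline = mean(history)
    tolerance = sensitivity * stdev(history)
    return value > baseline + tolerance


# Hypothetical response times in milliseconds.
history = [100, 105, 98, 102, 101, 99, 103]

spike = is_anomalous(history, 150)                             # far above baseline
normal = is_anomalous(history, 104)                            # within tolerance
stricter = is_anomalous(history, 104, sensitivity=1.0)         # tighter tolerance
overridden = is_anomalous(history, 150, static_threshold=200)  # static rule wins
```

In this sketch, tightening the sensitivity makes the same 104 ms measurement count as an anomaly, while a static threshold of 200 ms suppresses the 150 ms spike entirely.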

Notification and alerting

Through the Dynatrace mobile app, you can receive push notifications to your preferred mobile device and gain quick insights into all problems detected by Dynatrace. You can also set up fine-grained alert-filtering rules that are based on the severity, customer impact, associated tags, and/or duration of detected problems. Lastly, you can define maintenance windows during which alerts won't be generated.

Short-lived problems do not generate open problem notifications. For such problems, Dynatrace sends out only resolved problem notifications to inform you that such a problem occurred.
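
The alert-filtering rules and maintenance windows described above can be thought of as a predicate over problems. The following sketch shows one way to model that; it does not reflect the actual Dynatrace alerting profile configuration or API. The classes `Problem` and `AlertRule`, the function `should_alert`, and the severity and tag values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Problem:
    # Hypothetical problem record; field names are assumptions.
    severity: str
    tags: set
    started: datetime
    duration: timedelta


@dataclass
class AlertRule:
    # Hypothetical filter on severity, tags, and minimum duration.
    severities: set
    required_tags: set = field(default_factory=set)
    min_duration: timedelta = timedelta(0)

    def matches(self, problem):
        return (
            problem.severity in self.severities
            and self.required_tags <= problem.tags  # all required tags present
            and problem.duration >= self.min_duration
        )


def should_alert(problem, rule, maintenance_windows):
    """Alert only if the rule matches and no maintenance window is active."""
    in_maintenance = any(
        start <= problem.started < end for start, end in maintenance_windows
    )
    return rule.matches(problem) and not in_maintenance


rule = AlertRule({"availability"}, {"prod"}, timedelta(minutes=10))
windows = [(datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 4, 0))]

daytime = Problem("availability", {"prod", "checkout"},
                  datetime(2024, 1, 1, 12, 0), timedelta(minutes=30))
during_window = Problem("availability", {"prod"},
                        datetime(2024, 1, 1, 3, 0), timedelta(minutes=30))
too_short = Problem("availability", {"prod"},
                    datetime(2024, 1, 1, 12, 0), timedelta(minutes=2))

results = [should_alert(p, rule, windows)
           for p in (daytime, during_window, too_short)]
```

Only the daytime problem passes every check: the second starts inside the maintenance window, and the third is shorter than the rule's minimum duration.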

Basic concepts

  • How problems are detected and analyzed
  • Problem lifecycle and important timings
  • Event types
  • How Davis detects the impact of a problem
  • View the history of open/closed problems
  • Problem overview page

Problem analysis

  • Impact analysis
  • Root cause analysis
  • Event analytics
  • Use percentiles to analyze application performance

Problem detection

  • Automated multi-dimensional baselining
  • Static thresholds
  • Adjust the sensitivity of anomaly detection
  • Prediction-based anomaly detection
  • Detection of frequent issues
  • Metric events for alerting

Problem notification and alerting

  • Maintenance windows
  • Alerting profiles
  • Push notifications via the Dynatrace mobile app