Davis® AI analysis terminology
An alert is an active notification via a third-party system (for example, PagerDuty or Slack) that aims to get the immediate attention of a human user.
The concept of alerting is independent of events or problems—you can receive alerts about different events in Dynatrace, such as problems or detected security vulnerabilities.
An anomaly is a situation within a data set that's classified as outside the normal range. Dynatrace detects abnormal behavior using various detection models, such as static thresholds, auto-adaptive baselines, or more complex statistical or AI models.
An auto-adaptive baseline is a baseline that is retrained automatically. Retraining can be driven by a scheduler, a business calendar, or any other trigger, such as a successful load test or a software deployment. The current implementation of auto-adaptive and seasonal baselining for metric-based events uses a fixed schedule.
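As an illustration only, the following minimal sketch retrains a baseline on a fixed schedule. It is not the Dynatrace implementation: the class name ScheduledBaseline, the use of an observation count as a stand-in for a time-based schedule, and the mean over a sliding window are all assumptions made for the example.

```python
from collections import deque
from statistics import mean


class ScheduledBaseline:
    """Hypothetical baseline that is retrained every `retrain_every` observations."""

    def __init__(self, retrain_every, window):
        self.retrain_every = retrain_every   # fixed schedule, counted in observations
        self.history = deque(maxlen=window)  # sliding training window
        self.baseline = None                 # no baseline until the first retraining
        self._count = 0

    def observe(self, value):
        """Record a value and retrain the baseline when the schedule fires."""
        self.history.append(value)
        self._count += 1
        if self._count % self.retrain_every == 0:
            # Assumption: the baseline is the mean of the training window.
            self.baseline = mean(self.history)
        return self.baseline
```

Between two scheduled retrainings the baseline stays fixed, even when the signal shifts. A trigger-based variant would call the retraining step from a deployment or load-test hook instead of a counter.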
In statistics, a baseline represents the starting point for a comparison and is used to identify abnormal behavior.
In Dynatrace, a baseline is the observed and learned typical behavior of a time series in a specified time range, regardless of what the time series represents (for example, a measure of service response time or a host's CPU consumption).
A baseline represents the "expected" behavior and is often used as a basis for detecting abnormal signal ranges.
Alerting with baselines typically involves a range with a maximum threshold (for example, two times the measured variance of the signal above the learned baseline) to alert on abnormal behavior.
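A minimal sketch of such a range check follows. It takes the example above literally (baseline plus two times the variance); the function names, the use of the mean as the learned baseline, and the sample values are assumptions for illustration, not the Dynatrace algorithm.

```python
from statistics import mean, pvariance


def upper_bound(history, factor=2.0):
    """Return baseline + factor * variance for a list of past values."""
    baseline = mean(history)      # assumption: the mean stands in for the learned baseline
    variance = pvariance(history)
    return baseline + factor * variance


def exceeds_baseline(value, history, factor=2.0):
    """True when the value lies above the allowed range around the baseline."""
    return value > upper_bound(history, factor)


# Hypothetical history of a signal, for example response times in milliseconds.
history = [10.0, 12.0, 11.0, 9.0, 10.0, 11.0, 12.0, 10.0]
```

Note that variance is expressed in squared units, so a standard deviation is often used instead when the bound should stay in the unit of the signal.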
A correlation represents a statistical relationship between two variables (mostly time series). It measures how closely the variables are related, with no reference to causal information.
A strong correlation between two variables means that both variables show a similar or identical behavior over time.
Note that causal information, such as "variable A is the cause of the change to variable B," cannot be derived from a correlation alone.
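To make the idea concrete, here is a small sketch that computes the Pearson correlation coefficient of two equally long series. Pearson correlation is only one possible measure and is chosen here as an assumption; the text above does not state which measure Dynatrace uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)
```

A result close to 1 or -1 indicates a strong relationship, and a result close to 0 indicates a weak one. The value alone says nothing about which variable, if either, drives the other.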
A Davis score is a number from 0.0 to 1.0, attached to an item in a list of analysis results. Items with a high score rank above items with a low score.
Similarity scores are variable depending on context. An entity's score changes based on the comparison set included in your analysis.
An event is any time-stamped occurrence that has significance for your IT infrastructure. Custom-configured alerts for business KPIs (business events) and raw measurements of any single metric (metric key events) are also just events.
Dynatrace considers any time-stamped raw data that's detected by OneAgent, Extensions, or API to be an event. So, in addition to system events like restarts and service outages—which trigger time-stamped data—logs and spans can also qualify as raw events in Dynatrace because they generate time-stamped data. All events are reported with the date and time of their occurrence.
This definition is in line with the ITIL Event Management definition of the term "event."
An event can be defined as any detectable or discernible occurrence that has significance for the management of the IT Infrastructure or the delivery of IT service and evaluation of the impact a deviation might cause to the services. Events are typically notifications created by an IT service, Configuration Item (CI), or monitoring tool.
An incident is an unplanned interruption to an IT service or reduction in the quality of an IT service, or a failure of a Configuration Item that has not yet impacted an IT service (for example, failure of one disk in a disk-mirrored set). This definition is in alignment with the ITIL (Information Technology Infrastructure Library) definition.
In contrast to automatically detected events and problems, incidents are mostly handled by human operators who must manually evaluate and close incidents.
A measurement is a single value with a timestamp, typically part of a time series. Each measurement consists of a timestamp, a metric name, and a measurement value. For example, 20220825-12:55:60, CPU Usage, 99%.
A measurement can also include a set of dimensions that distinguish it from other measurements of the same metric for different aspects of the same entity.
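The structure described above can be sketched as a small record type. The class name Measurement, the field names, and the cpu_core dimension are hypothetical and chosen only for illustration; they do not reflect a Dynatrace data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


@dataclass
class Measurement:
    """One timestamped value of a metric, optionally qualified by dimensions."""
    timestamp: datetime
    metric_name: str
    value: float
    dimensions: Dict[str, str] = field(default_factory=dict)


# Two measurements of the same metric at the same time,
# distinguished only by a (hypothetical) dimension.
ts = datetime(2022, 8, 25, 12, 55, tzinfo=timezone.utc)
cpu_core_0 = Measurement(ts, "CPU Usage", 99.0, {"cpu_core": "0"})
cpu_core_1 = Measurement(ts, "CPU Usage", 42.0, {"cpu_core": "1"})
```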
A metric is a group of measurements of the same type and semantics that are collected by the same instrument.
A metric contains multiple time series that share the same metadata, such as the metric name and identifier (for example, CPU usage) or the unit.
A notification is a means of delivering a message to a user. Typically, software notifications are used to draw a user's attention to time-sensitive information, such as an incoming message. In Dynatrace, the built-in notification center proactively informs you of essential requirements of your Dynatrace environment, such as upgrading OneAgent or renewing a license.
While the terms "notification" and "alert" are often used interchangeably, they're not the same. Notifications are often related to non-critical, user-interface-driven details. In contrast, alerts inform users or third-party systems about critical problems (like the outage of an application or the crash of a web server).
In Dynatrace, a problem is a detected and analyzed abnormal situation. Davis detects problems and analyzes them to determine their root cause. A problem typically encompasses one or multiple events, metrics, logs, and traces that Davis identifies as causally relevant to the problem.
The Dynatrace definition of a problem differs from the definition specified by the ITIL (Information Technology Infrastructure Library).
A seasonal baseline is an enhanced baseline model that can cope with seasonal variability in data, such as daily and weekly business hours. Seasonal baseline models go beyond learning a single baseline value for a variable and instead extract and train a more complex model that varies depending on the detected seasonality.
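One simple way to picture a seasonal model is a separate baseline value per hour of the week. The sketch below is an assumption-laden illustration, not the Dynatrace model: the function names, the fixed weekly season, and the synthetic sample data are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def train_seasonal_baseline(samples):
    """Learn one baseline value per (weekday, hour) slot from (timestamp, value) pairs."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[(ts.weekday(), ts.hour)].append(value)
    return {slot: mean(values) for slot, values in buckets.items()}


def expected_value(model, ts):
    """Look up the learned baseline for the slot a timestamp falls into."""
    return model.get((ts.weekday(), ts.hour))


# Synthetic data: two weeks of hourly values, high during weekday business hours.
start = datetime(2024, 1, 1)  # a Monday
samples = []
for h in range(14 * 24):
    ts = start + timedelta(hours=h)
    busy = ts.weekday() < 5 and 9 <= ts.hour < 17
    samples.append((ts, 100.0 if busy else 10.0))

model = train_seasonal_baseline(samples)
```

With this model, a value of 100 on a Monday morning matches the expectation, while the same value on a Saturday morning is far above the learned baseline for that slot. A single baseline value could not capture this difference.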
In software observability, a signal is a piece of data that is generated by a system or application and that can be used to monitor, diagnose, or troubleshoot its behavior. A signal can take many forms: logs, metrics, traces, events, or alerts. It can indicate various aspects of the system, such as its performance, availability, errors, dependencies, or security.
Signals are typically collected and aggregated by observability tools, which allow operators or developers to visualize, analyze, and correlate them across different components or layers of the system. By observing and interpreting signals, stakeholders can gain insights into the system's state and behavior, detect anomalies, identify root causes or correlations, and make data-driven decisions to optimize or improve the system.
A threshold is a boundary value beyond which all measurements are considered outside the range of normal behavior. Depending on a system's configuration, all values above a maximum threshold (ceiling) or below a minimum threshold (floor) are considered abnormal. Thresholds provide a simple model for classifying measurements and, thereby, detecting abnormal behavior.
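The floor and ceiling check can be sketched in a few lines. The function name and parameters are hypothetical and serve only to illustrate the binary classification.

```python
def outside_thresholds(value, floor=None, ceiling=None):
    """True when the value is below the floor or above the ceiling (if configured)."""
    if floor is not None and value < floor:
        return True
    if ceiling is not None and value > ceiling:
        return True
    return False
```

For example, with a ceiling of 90 for CPU usage in percent, a value of 95 is classified as abnormal and a value of 50 is not.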
While a threshold represents a binary classification, multi-threshold configurations introduce more than one class. Typical examples of multi-threshold classifications are SLOs (Good, Warning, and Bad) and Apdex (Good, Acceptable, and Bad user experience).
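A multi-threshold classification can be sketched as a lookup over sorted cut points. The cut points of 500 and 2000 (read as milliseconds) are made-up values for illustration and are not the thresholds used by Apdex or by Dynatrace; the function name is also hypothetical.

```python
from bisect import bisect_left


def classify(value, cut_points, labels):
    """Map a value to one of len(cut_points) + 1 classes.

    cut_points must be sorted in ascending order. A value equal to a
    cut point falls into the lower class.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_left(cut_points, value)]


# Hypothetical response-time classes.
CUT_POINTS = [500, 2000]
LABELS = ["Good", "Acceptable", "Bad"]
```

A single threshold is the special case with one cut point and two labels.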
A time series is a series of data points that are indexed, listed, or graphed in time order. Dynatrace defines a time series as a sequence of measurements that are taken at regular intervals. Therefore, a time series in Dynatrace is a sequence of discrete-time data signals.
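The "regular intervals" property can be checked with a short helper. This is an illustrative sketch with a hypothetical function name, not part of any Dynatrace API.

```python
from datetime import datetime, timedelta


def is_regular(timestamps):
    """True when the timestamps are strictly increasing with a constant step."""
    if len(timestamps) < 2:
        return True
    step = timestamps[1] - timestamps[0]
    if step <= timedelta(0):
        return False
    return all(b - a == step for a, b in zip(timestamps, timestamps[1:]))


# One measurement per minute, then the same sequence with one point missing.
start = datetime(2024, 1, 1)
regular = [start + timedelta(minutes=i) for i in range(5)]
with_gap = regular[:3] + [regular[4]]
```

Here is_regular(regular) holds, while is_regular(with_gap) does not, because the last step is two minutes instead of one.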