Metric selector in custom metric events
The metric selector is a powerful tool for querying your data. It offers two main capabilities:
- Metric transformations, which modify the result of a metric (for example, by aggregating, splitting, sorting, or limiting it).
- Metric expressions, which combine one or more metrics into a different result using simple arithmetic.
In this example, we want to detect anomalies on the combined incoming and outgoing network traffic by calculating the sum of all bytes read (builtin:host.net.bytesRx) and written (builtin:host.net.bytesTx). The metric expression for that is:
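Assuming the usual arithmetic syntax of metric expressions, that sum would be written as builtin:host.net.bytesRx + builtin:host.net.bytesTx. The Python sketch below only illustrates what such an expression computes: it adds two series point by point. The sample values and the helper name combine_traffic are made up for illustration and are not part of any Dynatrace API.

```python
# Selector as it would be entered in the metric event (assumed syntax).
SELECTOR = "builtin:host.net.bytesRx + builtin:host.net.bytesTx"

# Hypothetical per-minute samples in bytes; the values are invented.
bytes_rx = [1200, 1500, 900, 1100]
bytes_tx = [800, 700, 1000, 950]


def combine_traffic(rx, tx):
    """Add the two series point by point, as the sum expression would."""
    if len(rx) != len(tx):
        raise ValueError("both series must cover the same timestamps")
    return [r + t for r, t in zip(rx, tx)]


combined = combine_traffic(bytes_rx, bytes_tx)
print(SELECTOR)
print(combined)  # [2000, 2200, 1900, 2050]
```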
This expression evaluates to a single metric result that Davis will use to learn a baseline and to detect and alert on anomalies.
A metric selector can consist of thousands of individual metric measurements. It is important to understand the implications when configuring a selector that consists of measurements coming from thousands of individual sources. Dynatrace applies safety limits to anomaly detection in terms of the number of metric dimensions that can be observed within one monitoring environment to avoid any operational issues.
While the idea of checking the metric used in a custom metric event via the Data explorer seems straightforward, there are some drawbacks to this approach.
The Data explorer applies some transformation functions for charting:
- Applies the metric's default aggregation
- Merges all measurements into a single line
- Sorts the results by value
- Limits the result to 10
These operations are the equivalent of the following metric expression:
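As a rough sketch, and assuming the splitBy, sort, and limit transformations of the metric selector syntax, the chain applied for charting can be assembled as shown below. The metric key builtin:host.net.bytesRx is only a stand-in, and the helper name is hypothetical.

```python
def data_explorer_equivalent(metric_id: str, limit: int = 10) -> str:
    """Append the charting transformations to a metric key.

    No explicit aggregation is added, so the metric's default applies.
    """
    transformations = [
        "splitBy()",                     # merge all dimensions into one line
        "sort(value(auto,descending))",  # sort by value, highest first
        f"limit({limit})",               # keep only the first results
    ]
    return ":".join([metric_id, *transformations])


selector = data_explorer_equivalent("builtin:host.net.bytesRx")
print(selector)
# builtin:host.net.bytesRx:splitBy():sort(value(auto,descending)):limit(10)
```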
While those transformations improve usability for charting and exploring the data, they are misleading in terms of continuous anomaly detection.
The split by transformation (splitBy) merges all dimensions into one. Used improperly, it can unintentionally hide thousands of data points within a single, expensive anomaly detection configuration.
The sort and limit transformations are risky: they not only add an expensive operation on top of the raw data but also introduce non-deterministic behavior into anomaly detection. Avoid both functions in any anomaly detection configuration, because at runtime the top-sorted results can change every minute.
Applying a sorted limit of 10 results within anomaly detection means that a potentially different set of 10 lines is monitored every couple of minutes.
This non-deterministic alerting behavior is even worse when configuring a baseline, as the learned baseline value for the last top-10 results will not be the same for the next evaluation run. As a result, you will end up with thousands of learned baselines that are not applicable to the current top 10.
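The effect can be illustrated with a small Python sketch. The host names, values, and the top_n helper are hypothetical; the sketch does not reproduce how Dynatrace evaluates a selector, it only shows how the membership of a sorted, limited result can shift between two evaluation runs.

```python
def top_n(values_by_host, n):
    """Return the set of hosts with the n highest values."""
    ranked = sorted(values_by_host, key=values_by_host.get, reverse=True)
    return set(ranked[:n])


# Hypothetical traffic per host for two consecutive evaluation runs.
run_1 = {"host-a": 90, "host-b": 80, "host-c": 40, "host-d": 30}
run_2 = {"host-a": 35, "host-b": 85, "host-c": 95, "host-d": 30}

watched_1 = top_n(run_1, 2)
watched_2 = top_n(run_2, 2)

# A baseline was learned for these hosts, but they are no longer evaluated.
dropped = watched_1 - watched_2
# These hosts are evaluated now, but no baseline was learned for them.
added = watched_2 - watched_1

print(sorted(dropped), sorted(added))  # ['host-a'] ['host-c']
```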
Combining metrics for anomaly detection
With the power of a metric expression, you can implement alerting with a top-down view of a situation rather than alerting on each individual component.
For example, you can observe log patterns across multiple hosts. By calculating the total count of observed log patterns across all relevant log files, Dynatrace can detect pattern anomalies on the accumulated log stream rather than on the individual counts per log file.
For sparse counts across many entities (for example, an error count across multiple processes of the same type), aggregated top-down anomaly detection is much more resilient to false-positive alerts than detection on an individual error count per process.
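The following Python sketch illustrates why, using invented data: six processes each report a single error in a different minute. It does not model the Davis baselining algorithm; the standard deviation is used here only as a simple stand-in for how noisy a series looks.

```python
from statistics import pstdev

# Hypothetical error counts per minute for six processes of the same type.
# Each process reports exactly one error, each in a different minute.
MINUTES = 6
per_process = [
    [1 if minute == process else 0 for minute in range(MINUTES)]
    for process in range(6)
]

# Top-down view: total errors per minute across all processes.
total = [sum(series[minute] for series in per_process) for minute in range(MINUTES)]

# Every individual series is a flat zero line with one isolated spike.
spiky = [pstdev(series) > 0 for series in per_process]

print(total)          # [1, 1, 1, 1, 1, 1]
print(all(spiky))     # True
print(pstdev(total))  # 0.0
```

Each per-process series contains a spike that could look anomalous on its own, while the aggregated series is steady.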
Metric events based on a metric selector support topology awareness. The resulting mapping depends on the data granularity of the result.
Metric selectors that are split by an entity retain the entity mapping and are topology-aware. The events raised on such metrics are mapped to the original source entity.
When metric selectors result in a single aggregated series, with no clear entity and topology reference, the events raised on such metrics are mapped to the global monitoring environment.
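The two cases above can be sketched in Python as follows. This is a simplified illustration, not the actual mapping logic: the dt.entity. prefix for entity dimensions, the entity ID, and the ENVIRONMENT placeholder are assumptions made for the example.

```python
ENTITY_PREFIX = "dt.entity."  # assumed naming convention for entity dimensions


def event_target(dimensions: dict) -> str:
    """Pick the entity an event is mapped to, or fall back to the environment."""
    for key, value in dimensions.items():
        if key.startswith(ENTITY_PREFIX):
            return value
    return "ENVIRONMENT"


# Result series of a selector split by host (hypothetical entity ID).
per_host = event_target({"dt.entity.host": "HOST-0000000000000001"})
# Result series of a selector aggregated into a single line.
aggregated = event_target({})

print(per_host)    # HOST-0000000000000001
print(aggregated)  # ENVIRONMENT
```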
Override topology mapping
You can override the automatic selection of the entity type that events are mapped to. Select only entity types that are referenced in the incoming metric measurements. If you select an entity type for which the metric does not provide the necessary dimension, the override is ignored.
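The rule that an unsupported override is ignored can be sketched as below. The function name, the dt.entity. key convention, and the entity type names are hypothetical and serve only to illustrate the check.

```python
def effective_entity_type(dimension_keys, automatic_type, override=None):
    """Return the override only if the metric carries the matching dimension."""
    if override is not None and f"dt.entity.{override}" in dimension_keys:
        return override
    # No override given, or the metric lacks the dimension: override is ignored.
    return automatic_type


# Hypothetical dimension keys present on the incoming measurements.
keys = {"dt.entity.host", "dt.entity.network_interface"}

accepted = effective_entity_type(keys, "host", override="network_interface")
ignored = effective_entity_type(keys, "host", override="process_group")

print(accepted)  # network_interface
print(ignored)   # host
```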
To override the automatic entity type, in the metric event configuration, expand Advanced entity settings and select the required entity type.