Metrics
Metrics are numerical measurements of your system. The system can be any component in your software stack: a small, specific part of your application code, something broader such as an underlying runtime engine, or the whole host machine and its operating system.
For example:
- An application could define a metric that represents the number of requests it received.
- A runtime could track its memory allocation and expose a metric detailing how much heap memory is free and how much is allocated.
- The host machine and operating system could record the utilization of each processor (along with its temperature), the number of processes, and inbound and outbound network traffic.
There are different metric instruments with different use cases, purposes, and limitations.
Metric instruments
OpenTelemetry supports the following four main instrument types:
- Counter (synchronous and asynchronous)
- UpDownCounter (synchronous and asynchronous)
- Histogram (synchronous only)
- Gauge (asynchronous only)
Synchronous and asynchronous instruments differ as to how and when their values are recorded. (This is unrelated to asynchronous programming.)
- Synchronous instruments are invoked directly from your business logic: you call them whenever there is a new value to record.
- Asynchronous instruments do not update their values themselves. Instead, you register a callback function, which OpenTelemetry calls at its own discretion and which must return the current value. Asynchronous instruments essentially follow the observer pattern.
For details, see Synchronous and Asynchronous instruments in the OpenTelemetry documentation.
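The difference between the two recording styles can be illustrated with a small toy model. This is only a sketch of the semantics, not the OpenTelemetry API: the classes `SyncCounter` and `AsyncGauge`, the `collect` function, and the metric names are hypothetical and exist purely for illustration.

```python
from typing import Callable, Dict, List


class SyncCounter:
    """Toy synchronous instrument: the application calls add() itself."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0

    def add(self, amount: int) -> None:
        self.value += amount


class AsyncGauge:
    """Toy asynchronous instrument: it only stores a callback."""

    def __init__(self, name: str, callback: Callable[[], float]) -> None:
        self.name = name
        self.callback = callback


def collect(counters: List[SyncCounter], gauges: List[AsyncGauge]) -> Dict[str, float]:
    """Stand-in for a collection cycle; this is the moment callbacks run."""
    snapshot: Dict[str, float] = {c.name: c.value for c in counters}
    for gauge in gauges:
        snapshot[gauge.name] = gauge.callback()
    return snapshot


requests = SyncCounter("http.requests")
queue: List[str] = ["job-1", "job-2"]
queue_size = AsyncGauge("queue.size", lambda: len(queue))

requests.add(1)        # business logic reports each update immediately
requests.add(1)
queue.append("job-3")  # nothing is reported here; the gauge is read later

snapshot = collect([requests], [queue_size])
```

The counter's value is built up by explicit calls, while the gauge's value is only read when `collect` invokes its callback.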
When creating a new instrument, it is important to decide on the right type (and for counters, decide on synchronous or asynchronous) and provide the following descriptive elements:
- A name
- A unit (for example, kilobytes)
- An optional description
Each instrument comes with a default aggregation type and monotonicity mode.
Counter
A counter, the most basic instrument type, accepts only positive values and adds each one to its current value, so its total never decreases (it is monotonic).
Counters are typically used to record values such as the number of bytes sent or received over the network, the number of user accounts created, and the number of performed database queries.
A counter can be used synchronously or asynchronously.
UpDownCounter
An up-down counter follows the same logic as a regular counter, except that it also accepts negative values: adding -1 to its value decrements the overall value.
Up-down counters are typically used to record values such as the number of active requests or the number of items in a queue.
An up-down counter can be used synchronously or asynchronously.
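The contrast between the two counter types can be sketched as follows. This is a toy model, not the OpenTelemetry API: the `Counter` and `UpDownCounter` classes here are hypothetical, and raising an error on a negative increment is an assumption of this sketch (an SDK may handle invalid input differently).

```python
class Counter:
    """Toy monotonic counter: the total can only grow."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> None:
        if amount < 0:
            # Assumption of this sketch: reject the value loudly.
            raise ValueError("a counter does not accept negative increments")
        self.total += amount


class UpDownCounter:
    """Toy non-monotonic counter: increments may be negative."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> None:
        self.total += amount


bytes_sent = Counter()
bytes_sent.add(512)
bytes_sent.add(1024)

active_requests = UpDownCounter()
active_requests.add(1)   # a request started
active_requests.add(1)   # another request started
active_requests.add(-1)  # one request finished
```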
Histogram
A histogram does not record individual values, but rather groups them according to their distribution within the value range. Use a histogram if you are mostly interested in the statistical distribution of values rather than in each individual value.
A typical use is the distribution of response times, for example to see how many requests took less than a second.
A histogram can only be used synchronously.
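As an illustration of how values are grouped rather than stored, the following sketch counts response times per bucket. The function `bucket_counts`, the bucket bounds, and the sample values are assumptions chosen for this example; bounds are treated as inclusive upper limits.

```python
from bisect import bisect_left
from typing import List, Sequence


def bucket_counts(values: Sequence[float], bounds: Sequence[float]) -> List[int]:
    """Count values per bucket.

    Bucket i holds values v with bounds[i-1] < v <= bounds[i];
    the final bucket holds everything above the last bound.
    """
    counts = [0] * (len(bounds) + 1)
    for value in values:
        counts[bisect_left(bounds, value)] += 1
    return counts


# Hypothetical response times in seconds.
response_times = [0.12, 0.48, 0.95, 1.0, 1.7, 3.2]
bounds = [0.5, 1.0, 2.0]

counts = bucket_counts(response_times, bounds)
at_most_one_second = counts[0] + counts[1]
```

Only the per-bucket counts survive; the individual response times are not retained.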
Gauge
Unlike counters, a gauge does not sum up values but records them individually instead.
Typical gauge uses are the memory used by your application or the processor temperatures.
A gauge can only be used asynchronously.
Instrument summary
Instrument | Description | Sync/Async | Default aggregation | Monotonicity |
---|---|---|---|---|
Counter | Records and sums up positive values | Either | Sum | Monotonic |
UpDownCounter | Records and sums up positive and negative values | Either | Sum | Non-monotonic |
Histogram | Records the distribution of values across their value range | Sync | Bucket | Monotonic |
Gauge | Records individual, non-additive values | Async | Last value | Non-monotonic |
Exemplars
Exemplars provide a way to associate individual metric values with traces and spans.
When you have exemplars enabled (depending on the SDK you are using, they may be disabled by default) and create a new measurement for a metric in the context of an active span, that measurement is linked to that span, along with the following information:
- The value
- The time
- Any additional metric attributes not already configured by the default view
- The trace and span ID
Exemplars, and their SDK setup, can be quite a complex topic, so be sure to see the OpenTelemetry exemplar specification for full details.
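The pieces of information listed above can be pictured as a small record attached to a measurement. This is a toy model only: the `Exemplar` dataclass, the `make_exemplar` helper, and all attribute names and IDs are hypothetical and do not reflect any SDK's actual types.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Exemplar:
    value: float
    time_unix_nano: int
    filtered_attributes: Dict[str, str]  # attributes the view did not keep
    trace_id: str
    span_id: str


def make_exemplar(
    value: float,
    attributes: Dict[str, str],
    view_keys: Set[str],
    active_span: Optional[Tuple[str, str]],
) -> Optional[Exemplar]:
    """Link a measurement to the active span, if there is one."""
    if active_span is None:
        return None  # no span context, so nothing to link to
    trace_id, span_id = active_span
    extra = {k: v for k, v in attributes.items() if k not in view_keys}
    return Exemplar(value, time.time_ns(), extra, trace_id, span_id)


# Hypothetical measurement recorded while a span is active.
exemplar = make_exemplar(
    0.42,
    {"http.method": "GET", "user.id": "u-17"},
    view_keys={"http.method"},
    active_span=("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"),
)
```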
Views
A view is a type of post-processing mechanism for metrics that allows for adjustments and transformations of metrics before they are published.
With views, you can:
- Exclude instruments from being published (for example, an application publishes the temperature and humidity when only the temperature is required)
- Customize the default aggregation of an instrument (for example, a histogram is used to measure request durations when only the total count of requests is required)
- Change attributes of individual metrics (for example, an HTTP metric contains the request method and the status code when only the status code is required)
How you configure and set up views depends heavily on the SDK you use. For details, see the OpenTelemetry view specification.
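To make the effect of a view more concrete, the following toy sketch drops one instrument and reduces the attributes of another before the values are summed. It is not SDK configuration: the `apply_views` function, its parameters, and the metric names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (instrument name, attributes, value)
Point = Tuple[str, Dict[str, str], float]
Key = Tuple[str, Tuple[Tuple[str, str], ...]]


def apply_views(
    points: Iterable[Point],
    dropped_instruments: Set[str],
    allowed_keys: Dict[str, Set[str]],
) -> Dict[Key, float]:
    """Drop unwanted instruments, trim attributes, then sum per series."""
    totals: Dict[Key, float] = defaultdict(float)
    for name, attrs, value in points:
        if name in dropped_instruments:
            continue  # this instrument is excluded from publishing
        keys = allowed_keys.get(name)
        if keys is not None:
            attrs = {k: v for k, v in attrs.items() if k in keys}
        totals[(name, tuple(sorted(attrs.items())))] += value
    return dict(totals)


points = [
    ("http.requests", {"method": "GET", "status": "200"}, 1),
    ("http.requests", {"method": "POST", "status": "200"}, 1),
    ("http.requests", {"method": "GET", "status": "500"}, 1),
    ("room.humidity", {"room": "lab"}, 40.0),
]

published = apply_views(
    points,
    dropped_instruments={"room.humidity"},
    allowed_keys={"http.requests": {"status"}},
)
```

Removing the `method` attribute merges the GET and POST series for status 200 into a single series.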
Aggregation
The aggregation type of an instrument is configured in an applicable view and defines how recorded values are handled collectively.
Each instrument comes with a default aggregation, which can be overridden with a custom view.
Operations
- Sum—sums up the recorded values for the selected time period
- Last Value—only uses the last value provided in the selected time period
- Histogram—computes statistics based on the distribution of the values
- Drop—all values should be dropped/ignored
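The four operations can be compared side by side on the same set of recorded values. This is a simplified sketch: the `aggregate` function and its operation names are hypothetical, and the histogram case returns only summary statistics instead of full bucket counts.

```python
from typing import Dict, List, Optional, Union

Result = Optional[Union[float, Dict[str, float]]]


def aggregate(values: List[float], operation: str) -> Result:
    """Apply one aggregation operation to the values of a time period."""
    if operation == "drop" or not values:
        return None  # nothing is reported
    if operation == "sum":
        return sum(values)
    if operation == "last_value":
        return values[-1]
    if operation == "histogram":
        # Simplification: summary statistics only, no buckets.
        return {
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
        }
    raise ValueError(f"unknown operation: {operation}")


measurements = [3, 5, 2]  # values recorded during one period
```

For the measurements above, `aggregate(measurements, "sum")` yields 10, while `aggregate(measurements, "last_value")` yields 2.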
Temporality
When exporting to Dynatrace, it is important to always use the delta temporality.
See OpenTelemetry Metrics Limitations for details.
The temporality of an aggregation specifies whether previously recorded values of an instrument continue to be incorporated into subsequent reports. For this purpose, two temporality types are available:
- Cumulative—published values reflect the total, cumulative sum since the creation of the instrument
- Delta—published values reflect only the difference since the last published value
Configure the temporality type with the respective SDK functions or the environment variable OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE.
For details, see the OpenTelemetry temporality specification.
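The relationship between the two temporality types can be shown with a short conversion sketch. The function names and the sample values are assumptions for illustration; the sketch assumes the instrument starts at zero.

```python
from itertools import accumulate
from typing import List


def cumulative_to_delta(cumulative: List[int]) -> List[int]:
    """Turn running totals into per-report differences."""
    deltas = []
    previous = 0  # assumption: the instrument starts at zero
    for total in cumulative:
        deltas.append(total - previous)
        previous = total
    return deltas


def delta_to_cumulative(deltas: List[int]) -> List[int]:
    """Turn per-report differences back into running totals."""
    return list(accumulate(deltas))


# Hypothetical values of a request counter at four consecutive reports.
cumulative_reports = [5, 12, 12, 20]
delta_reports = cumulative_to_delta(cumulative_reports)
```

With delta temporality, a report of 0 means that nothing was recorded since the previous report, whereas the cumulative series simply repeats the previous total.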
Dynatrace mapping
Instrument | Temporality | Dynatrace metric type |
---|---|---|
Counter | Delta | Counter |
Counter | Cumulative | N/A |
UpDownCounter | Delta | Counter |
UpDownCounter | Cumulative | Gauge |
Gauge | N/A | Gauge |
Dynatrace does not currently support histograms.