OpenTelemetry metric concepts

A metric represents a data series over time with a predefined aggregation and set of attributes. It can show information about the execution of a program at any given time. The collected metrics can then be summarized and utilized through various categories.

The OpenTelemetry metric signal aims to:

  • Connect metrics with other signals (for example, traces)
  • Provide a path for OpenCensus users to migrate
  • Work together with existing metric solutions

API and SDK separation

  • The Metrics API is responsible for capturing raw measurements and decoupling the instrumentation from the SDK.
  • The Metrics SDK is responsible for implementing the API, as well as for providing functionality and extensibility (such as configuration, processors, and exporters).

This separation allows for the configuration of different SDKs at runtime.

Programming model

The following programming model is based on the OpenTelemetry official documentation:

[Figure: OpenTelemetry metrics programming model]

MeterProvider

A MeterProvider is the only component of metric signals accessible by both the API and SDK. At its core, the MeterProvider is responsible for creating Meter instances.

Since the API and SDK are separated, and they can both access the MeterProvider, they have slightly varying uses for it:

  • The API provides a way to set, register, and access a global default MeterProvider, which also holds the configuration (some applications might want to use multiple MeterProvider instances for different configurations).
  • Within the SDK, the MeterProvider is responsible for allowing a Resource to be specified. This Resource should be associated with all the metrics produced by any Meter from the MeterProvider.

API

The Metrics API has three main components:

  1. MeterProvider, which is the entry point of the API and provides access to Meters
  2. Meter, which is the class responsible for creating Instruments
  3. Instrument, which is responsible for reporting Measurements

Meter

The Meter is responsible for creating Instruments.

It does not hold any configuration.

Note that the names of the Instruments under one Meter should not interfere with Instruments under another Meter.
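
The relationship between the three API components, and the way Instrument names are scoped to their Meter, can be pictured with a toy model. The classes below are hypothetical stand-ins written for illustration only; they are not the OpenTelemetry API.

```python
class ToyInstrument:
    """Hypothetical stand-in for an Instrument: it only records raw measurements."""

    def __init__(self, meter_name, name):
        self.meter_name = meter_name
        self.name = name
        self.measurements = []

    def record(self, value):
        self.measurements.append(value)


class ToyMeter:
    """Hypothetical stand-in for a Meter: it creates Instruments and holds no configuration."""

    def __init__(self, name):
        self.name = name
        self._instruments = {}

    def create_instrument(self, name):
        # Instruments are looked up per Meter, so the same name under
        # another Meter refers to a different Instrument.
        return self._instruments.setdefault(name, ToyInstrument(self.name, name))


class ToyMeterProvider:
    """Hypothetical stand-in for a MeterProvider: the entry point that hands out Meters."""

    def __init__(self):
        self._meters = {}

    def get_meter(self, name):
        return self._meters.setdefault(name, ToyMeter(name))


provider = ToyMeterProvider()
http_requests = provider.get_meter("http.library").create_instrument("requests")
db_requests = provider.get_meter("db.library").create_instrument("requests")

http_requests.record(1)
print(http_requests.measurements, db_requests.measurements)  # [1] []
```

Both Instruments are named "requests", but because each belongs to a different Meter, recording on one leaves the other untouched.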

The Meter can create the following Instruments:

| Instrument | Async? | Parameters | Description | Example |
|---|---|---|---|---|
| Counter | No | name (required), unit, description | Reports non-negative increments | number of bytes received, number of accounts created |
| Asynchronous Counter | Yes | name (required), unit, description, callback | Reports monotonically increasing values | CPU time for each thread, each process, or the entire system |
| Histogram | No | name (required), unit, description | Reports arbitrary values that are likely to be statistically meaningful | request duration, size of response payload |
| Asynchronous Gauge | Yes | name (required), unit, description, callback | Reports non-additive values | room temperature, CPU fan speed |
| UpDownCounter | No | name (required), unit, description | Supports increments and decrements | number of active requests, number of items in a queue |
| Asynchronous UpDownCounter | Yes | name (required), unit, description, callback | Reports additive values | process heap size, number of items in a lock-free circular buffer |

Instruments

The instruments created by the Meter:

  • Counter
  • Asynchronous Counter
  • Histogram
  • Asynchronous Gauge
  • UpDownCounter
  • Asynchronous UpDownCounter

Counter

A Counter is a synchronous Instrument that supports non-negative increments.

It can be used to count, for example, the number of bytes received, accounts created, or checkpoints run.

  • Parameters: name (required), unit, and description.
  • Does not support negative increments.
  • Single function: Add (increments the Counter by a given non-negative amount). This function also supports attributes.
python
from opentelemetry import metrics

meter = metrics.get_meter(__name__)
exception_counter = meter.create_counter(
    name="exceptions", description="number of exceptions caught"
)
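
The example above only creates the Counter. To illustrate what Add does with attributes, here is a toy model, not the OpenTelemetry implementation: `ToyCounter` is a hypothetical class that keeps one running sum per attribute set. Raising an error on a negative amount is a choice made for this sketch; a real SDK may handle invalid input differently.

```python
class ToyCounter:
    """Hypothetical model of a Counter: one running sum per attribute set."""

    def __init__(self, name):
        self.name = name
        self.sums = {}

    def add(self, amount, attributes=None):
        if amount < 0:
            raise ValueError("a Counter only supports non-negative increments")
        # Attribute order must not matter, so use a sorted tuple as the key.
        key = tuple(sorted((attributes or {}).items()))
        self.sums[key] = self.sums.get(key, 0) + amount


exceptions = ToyCounter("exceptions")
exceptions.add(1, {"type": "ValueError"})
exceptions.add(2, {"type": "ValueError"})
exceptions.add(1, {"type": "KeyError"})

print(exceptions.sums)
# {(('type', 'ValueError'),): 3, (('type', 'KeyError'),): 1}
```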

Asynchronous Counter

An Asynchronous Counter reports monotonically increasing values. That could be CPU time for a specific process or for the entire system.

  • Parameters: name (required), unit, and description.
    • Asynchronous instruments can carry a callback function as a parameter. The callback function reports Measurements only when the Meter is being observed.
  • Recommendation: do not provide more than one Measurement with the same attributes in a single callback function.
  • Recommendation: implementations should use the name ObservableCounter (or language-specific variations, such as observable_counter).
python
from opentelemetry import metrics
from opentelemetry.metrics import CallbackOptions, Observation

meter = metrics.get_meter(__name__)

def pf_callback(options: CallbackOptions):
    # Note: in the real world these would be retrieved from the operating system
    return [
        Observation(8, {"pid": 0, "bitness": 64}),
        Observation(37741921, {"pid": 4, "bitness": 64}),
        Observation(10465, {"pid": 880, "bitness": 32}),
    ]

meter.create_observable_counter(
    name="PF", callbacks=[pf_callback], description="process page faults"
)

Histogram

A Histogram is an approximate representation of the distribution of numerical data.

In OpenTelemetry, a Histogram is a synchronous Instrument that can be used to report arbitrary values that are likely to be statistically meaningful, such as the request duration, or the size of the response payload.

  • Parameters: name (required), unit, and description.
  • Intended for statistics.
python
from opentelemetry import metrics

meter = metrics.get_meter(__name__)
http_server_duration = meter.create_histogram(
    name="http.server.duration",
    description="measures the duration of the inbound HTTP request",
    unit="ms",
)

Asynchronous Gauge

An Asynchronous Gauge reports non-additive values, that is, values that do not make sense to sum up, when the Instrument is being observed (via a callback function).

For example, there could be data about the temperature in different rooms, or the CPU fan speed.

  • Parameters: name (required), unit, and description.
    • Asynchronous instruments can carry a callback function as a parameter. The callback function reports Measurements only when the Meter is being observed.
  • These measurements are looked at individually, or as an average, a maximum or minimum, but not summed up.
  • Asynchronous Gauge does not have a synchronous counterpart.
  • Recommendation: do not provide more than one Measurement with the same attributes in a single callback.
  • Recommendation: if you wish to report a sum of any kind, use a Counter instead.
python
from opentelemetry import metrics
from opentelemetry.metrics import CallbackOptions, Observation

meter = metrics.get_meter(__name__)

def cpu_frequency_callback(options: CallbackOptions):
    # Note: in the real world these would be retrieved from the operating system
    return [
        Observation(3.38, {"cpu": 0, "core": 0}),
        Observation(3.51, {"cpu": 0, "core": 1}),
        Observation(0.57, {"cpu": 1, "core": 0}),
        Observation(0.56, {"cpu": 1, "core": 1}),
    ]

meter.create_observable_gauge(
    name="cpu.frequency",
    callbacks=[cpu_frequency_callback],
    description="the real-time CPU clock speed",
    unit="GHz",
)

UpDownCounter

An UpDownCounter is a synchronous Instrument that supports increments and decrements. An UpDownCounter can be used, for example, for the number of active requests or number of items in a queue.

  • Parameters: name (required), unit, and description.
  • Intended for scenarios where the absolute values are not pre-calculated.
  • Recommendation: if the absolute value is pre-calculated, or fetching the current value is straightforward, use an Asynchronous UpDownCounter instead. If the value increases monotonically, use a Counter instead.
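python
from opentelemetry import metrics
from opentelemetry.metrics import Observation

meter = metrics.get_meter(__name__)
items = []

# len(items) is cheap to fetch, so this uses the asynchronous variant.
meter.create_observable_up_down_counter(
    name="store.inventory",
    callbacks=[lambda options: [Observation(len(items))]],
    description="the number of the items available",
)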

Asynchronous UpDownCounter

An Asynchronous UpDownCounter is an asynchronous Instrument that reports additive values, for example, the process heap size.

  • Parameters: name (required), unit, and description.
    • Asynchronous instruments can carry a callback function as a parameter. The callback function reports Measurements only when the Meter is being observed.
  • Recommendation: if the value increases monotonically, use an Asynchronous Counter instead. If the value is non-additive, use an Asynchronous Gauge instead.
  • Recommendation: do not provide more than one Measurement with the same attributes in a single callback function.
python
from opentelemetry import metrics
from opentelemetry.metrics import CallbackOptions, Observation

meter = metrics.get_meter(__name__)

def ws_callback(options: CallbackOptions):
    # Note: in the real world these would be retrieved from the operating system
    return [
        Observation(8, {"pid": 0, "bitness": 64}),
        Observation(20, {"pid": 4, "bitness": 64}),
        Observation(126032, {"pid": 880, "bitness": 32}),
    ]

meter.create_observable_up_down_counter(
    name="process.workingset",
    callbacks=[ws_callback],
    description="process working set",
    unit="kB",
)

Measurement

A measurement represents a data point reported via API to the SDK. It has a value and can carry attributes.

Exemplar

An exemplar is an example data point for aggregated data.

It provides OpenTelemetry context to a metric event within a Metric, meaning it includes the request’s trace ID in the labels.

This allows users to link trace signals with Metrics, enabling correlation between a metric event and the API call where measurements are recorded.

An exemplar consists of the trace associated with the recording (trace and/or span ID), the time of observation, the recorded value, and a set of filtered attributes (insight into the Context when the observation was made).
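
The parts of a Measurement and an Exemplar described above can be sketched as plain data classes. This is an illustrative model only: `ToyMeasurement`, `ToyExemplar`, `make_exemplar`, and the field names are hypothetical, and the sketch assumes that the filtered attributes are the ones recorded on the measurement but not kept on the aggregated metric.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyMeasurement:
    """A data point reported through the API: a value plus optional attributes."""
    value: float
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class ToyExemplar:
    """An example data point that links aggregated data back to a trace."""
    value: float
    time_unix_nano: int
    trace_id: str
    span_id: str
    filtered_attributes: dict = field(default_factory=dict)


def make_exemplar(measurement, kept_keys, time_unix_nano, trace_id, span_id):
    # Attributes that the aggregated metric does not keep are preserved
    # on the exemplar as "filtered attributes" (assumption of this sketch).
    filtered = {k: v for k, v in measurement.attributes.items() if k not in kept_keys}
    return ToyExemplar(measurement.value, time_unix_nano, trace_id, span_id, filtered)


m = ToyMeasurement(12.5, {"http.method": "GET", "user.id": "42"})
ex = make_exemplar(
    m,
    kept_keys={"http.method"},
    time_unix_nano=1_700_000_000_000_000_000,
    trace_id="0af7651916cd43dd8448eb211c80319c",
    span_id="b7ad6b7169203331",
)
print(ex.filtered_attributes)  # {'user.id': '42'}
```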

SDK

The Metrics SDK is responsible for implementing the API, as well as for providing functionality and extensibility, such as configuration, processors, and exporters.

View

A View is useful for metric customization and can be registered with a MeterProvider.

With a view, you can:

  • Choose which Instruments should be processed or ignored. For example, in a case where both temperature and humidity can be processed, the developer might need to process only the temperature.
  • Change the aggregation. For example, a developer might want only the total count of outgoing requests instead of the request duration, which is aggregated as a Histogram by default.
  • Choose which attributes are reported on metrics. For example, a developer might care only about the HTTP status code of a request, instead of the method. In other cases, a developer might not want any attributes at all.
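python
from opentelemetry.sdk.metrics import Counter, Histogram, MeterProvider, ObservableCounter
from opentelemetry.sdk.metrics.export import (
    AggregationTemporality,
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.metrics.view import SumAggregation, View

# Counter X will be exported as delta sum
# Histogram Y and Gauge Z will be exported with 2 attributes (a and b)
delta = AggregationTemporality.DELTA
exporter = ConsoleMetricExporter(
    preferred_temporality={Counter: delta, ObservableCounter: delta, Histogram: delta}
)
meter_provider = MeterProvider(
    metric_readers=[PeriodicExportingMetricReader(exporter)],
    views=[
        View(instrument_name="X", aggregation=SumAggregation()),
        View(instrument_name="*", attribute_keys={"a", "b"}),
    ],
)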
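
To make the effect of choosing attributes in a View more concrete, the following toy function shows how dropping attribute keys merges measurements into fewer series. `apply_attribute_keys` and the sample data are hypothetical and only model the idea for a sum; they are not part of the SDK.

```python
from collections import defaultdict


def apply_attribute_keys(measurements, attribute_keys):
    """Sum measurements after dropping every attribute not in attribute_keys."""
    totals = defaultdict(int)
    for value, attributes in measurements:
        kept = tuple(sorted((k, v) for k, v in attributes.items() if k in attribute_keys))
        totals[kept] += value
    return dict(totals)


requests = [
    (1, {"http.method": "GET", "http.status_code": 200}),
    (1, {"http.method": "POST", "http.status_code": 200}),
    (1, {"http.method": "GET", "http.status_code": 500}),
]

# Keep only the status code: the two 200 responses collapse into one series.
print(apply_attribute_keys(requests, {"http.status_code"}))
# {(('http.status_code', 200),): 2, (('http.status_code', 500),): 1}

# Keep no attributes at all: a single total remains.
print(apply_attribute_keys(requests, set()))
# {(): 3}
```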

Aggregation

An Aggregation is configured via the view and controls how to compute incoming Measurements into Aggregated Metrics. It specifies an operation (such as Sum, Histogram, Min, or Max) and (optional) configuration parameter overrides.

The operation’s default configuration parameter values will be used unless overridden by optional configuration parameter overrides.

Default aggregations:

  • All types of Counter default to Sum Aggregation
  • Gauge defaults to Last Value Aggregation
  • Histogram defaults to Histogram Aggregation
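python
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.view import ExplicitBucketHistogramAggregation, View

# Use Histogram with custom boundaries
meter_provider = MeterProvider(
    views=[
        View(
            instrument_name="X",
            aggregation=ExplicitBucketHistogramAggregation(boundaries=(0, 10, 100)),
        )
    ]
)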
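
As a rough illustration of what the Sum, Last Value, and explicit-bucket Histogram operations compute, here is a minimal sketch over a plain list of values. The function names are hypothetical, and the sketch assumes that each bucket boundary is an inclusive upper bound; a real SDK also tracks timestamps, temporality, and attributes.

```python
from bisect import bisect_left


def sum_aggregation(values):
    return sum(values)


def last_value_aggregation(values):
    return values[-1]


def explicit_bucket_histogram(values, boundaries):
    """Count values per bucket; each boundary is treated as an inclusive upper bound."""
    counts = [0] * (len(boundaries) + 1)
    for value in values:
        counts[bisect_left(boundaries, value)] += 1
    return counts


durations = [5, 10, 42, 250]
print(sum_aggregation(durations))                          # 307
print(last_value_aggregation(durations))                   # 250
print(explicit_bucket_histogram(durations, [0, 10, 100]))  # [0, 2, 1, 1]
```

With the boundaries [0, 10, 100], the four buckets are (-inf, 0], (0, 10], (10, 100], and (100, +inf).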

MetricReader and MetricExporter

A MetricReader can be registered on the MeterProvider (multiple instances can be registered). It collects metrics from the SDK on demand and handles ForceFlush as well as Shutdown signals from the SDK. When constructing a MetricReader, it must be provided with a MetricExporter as well.

A MetricExporter is used to send the metrics to a backend. There are two basic kinds of exporter:

  • A push exporter sends data based on a pre-configured interval (or in case of severe error)
  • A pull exporter reacts to a scraper to send its data

As with MetricReader instances, multiple MetricExporter instances can be configured on the same MeterProvider, with one MetricReader for each MetricExporter.
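
The difference between push and pull exporters can be sketched with a toy model. All classes here are hypothetical and only show who initiates the transfer; a real push exporter would be driven by a timer rather than by explicit `tick()` calls.

```python
class ToyReader:
    """Hypothetical reader: collects the current metric values on demand."""

    def __init__(self, source):
        self._source = source

    def collect(self):
        return dict(self._source)


class ToyPushExporter:
    """Push: the application side decides when to send (here: each tick() call)."""

    def __init__(self, reader, send):
        self._reader = reader
        self._send = send

    def tick(self):
        # Stands in for "the configured interval has elapsed".
        self._send(self._reader.collect())


class ToyPullExporter:
    """Pull: data leaves only when a scraper asks for it."""

    def __init__(self, reader):
        self._reader = reader

    def handle_scrape(self):
        return self._reader.collect()


metrics_state = {"requests": 3}
sent = []

# One reader per exporter, both observing the same metrics.
push = ToyPushExporter(ToyReader(metrics_state), sent.append)
pull = ToyPullExporter(ToyReader(metrics_state))

push.tick()                   # interval elapsed: data is sent
metrics_state["requests"] = 5
print(sent)                   # [{'requests': 3}]
print(pull.handle_scrape())   # {'requests': 5}
```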

Semantic conventions

Apply the following guidelines when creating names for your metrics:

Consistency

Be consistent when naming metrics and attributes, which includes nesting associated metrics in a hierarchical structure and sticking to consistent naming for common attributes.

Name reuse

Wherever possible, avoid reusing the name of a metric that existed in the past but has since been renamed.

Units

Metrics that have their unit included in the OpenTelemetry metadata should not carry the unit in their name.
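
A naming check for this guideline could look like the following sketch. The helper `name_carries_unit` and the `UNIT_WORDS` list are hypothetical and deliberately incomplete; they only illustrate the rule and are not part of any OpenTelemetry tooling.

```python
# Hypothetical, incomplete list of unit words to look for in metric names.
UNIT_WORDS = {"seconds", "milliseconds", "bytes", "kilobytes", "percent"}


def name_carries_unit(metric_name):
    """Return True if any dot- or underscore-separated part of the name is a unit word."""
    parts = metric_name.replace("_", ".").split(".")
    return any(part in UNIT_WORDS for part in parts)


print(name_carries_unit("http.server.duration"))          # False
print(name_carries_unit("http.server.duration_seconds"))  # True
```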

Pluralization

The name of a metric should only be pluralized if the unit of the metric in question is a non-unit (such as operations or packets).

For topic-specific semantic conventions, see Metrics Semantic Conventions.

Select a correct instrument

Choosing the correct instrument to report Measurements is critical to achieving better efficiency, easing consumption for the user, and maintaining clarity in the semantics of the metric stream.

The OpenTelemetry documentation provides guidance on choosing the correct instrument.

Based on your intention, you can apply the following guidelines:

I want to count something

To count is to record the delta value.

  • If the value is monotonically increasing (the delta value is always non-negative), use a Counter.
  • If the value is NOT monotonically increasing (the delta value can be positive, negative or zero), use an UpDownCounter.

I want to record or time something

If you expect that the collected statistics are meaningful, use a Histogram.

I want to measure something

To measure is to report an absolute value.

  • If it makes NO sense to add up the values across different sets of attributes, use an Asynchronous Gauge.
  • If it makes sense to add up the values across different sets of attributes:
    • If the value is monotonically increasing, use an Asynchronous Counter.
    • If the value is NOT monotonically increasing, use an Asynchronous UpDownCounter.
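
The guidelines above can be condensed into a small decision function. This is only a sketch of the decision tree: `choose_instrument` and its parameter names are hypothetical, and it returns the instrument name as a string rather than creating anything.

```python
def choose_instrument(intent, monotonic=False, additive=True):
    """Map the guidelines to an instrument name.

    intent: "count" (record a delta), "record" (statistics or timing),
            or "measure" (report an absolute value).
    """
    if intent == "count":
        return "Counter" if monotonic else "UpDownCounter"
    if intent == "record":
        return "Histogram"
    if intent == "measure":
        if not additive:
            return "Asynchronous Gauge"
        return "Asynchronous Counter" if monotonic else "Asynchronous UpDownCounter"
    raise ValueError(f"unknown intent: {intent!r}")


print(choose_instrument("count", monotonic=True))     # Counter
print(choose_instrument("record"))                    # Histogram
print(choose_instrument("measure", additive=False))   # Asynchronous Gauge
print(choose_instrument("measure", monotonic=False))  # Asynchronous UpDownCounter
```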