Create Prometheus extension

Dynatrace provides a framework that you can use to extend your application and service observability to data acquired directly from Prometheus. The Dynatrace extensions framework can pull Prometheus metrics from the /metrics endpoint, a Prometheus API endpoint, or a data exporter (Prometheus target).

Note that Dynatrace provides out-of-the-box support for ingesting metrics from Prometheus exporters in Kubernetes and an ActiveGate extension for ingesting metrics from Amazon Managed Service for Prometheus.

You can run Prometheus extensions right on the Prometheus host where you installed OneAgent, so your metrics are automatically enriched with host-specific dimensions. If, however, you can't install OneAgent on the Prometheus host, you can run extensions remotely and execute them on an ActiveGate group of your choice.

We assume the following:

Prerequisites and limits

Be sure to review all prerequisites and limits.

Supported Dynatrace versions

  • Dynatrace version 1.225+
  • ActiveGate version 1.225+
  • OneAgent version 1.225+ (local extensions)

Limits

For limits applying to your extension, see Extensions 2.0 limits and the following Prometheus-specific limits:

  • You can pull metrics from a maximum of 1,000 Prometheus endpoints
  • Maximum 1,000 metrics definitions
  • Maximum 50 dimensions per metric

Volatile dimensions

Note that a large number of dimensions can exceed these limits and degrade the performance of your Dynatrace environment. Consider that:

  • Prometheus labels automatically become Dynatrace dimensions.
  • Certain metrics can be assigned to dimensions with a constantly increasing set of values, each of them becoming a new dimension.
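
One way to spot volatile dimensions before you write the extension is to count the distinct values of each label in a sample scrape. The following Python sketch assumes the plain-text Prometheus exposition format and uses a deliberately simple regular expression (it doesn't handle escaped quotes inside label values). The function name label_cardinality and the sample data are hypothetical.

```python
import re
from collections import defaultdict

# Matches `name{labels} value` lines; lines without labels are skipped.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)\{(.*)\}\s+\S+')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def label_cardinality(exposition_text):
    """Return {label_name: number of distinct values} for a scrape sample."""
    values = defaultdict(set)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        match = LINE_RE.match(line)
        if not match:
            continue
        for name, value in LABEL_RE.findall(match.group(2)):
            values[name].add(value)
    return {name: len(vals) for name, vals in values.items()}


# Hypothetical sample: request_id takes a new value on every line.
sample = """\
# TYPE http_requests_total counter
http_requests_total{job="api",request_id="a1"} 1
http_requests_total{job="api",request_id="b2"} 1
http_requests_total{job="api",request_id="c3"} 1
"""
cardinality = label_cardinality(sample)
print(cardinality)
```

A label whose count keeps growing between scrapes (request_id in this sample) is a candidate for a volatile dimension.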

Define data scope

Create an inventory of Prometheus endpoints you'd like to reference in your extension, as well as metrics and dimension values.

In our example, we create a simple extension collecting Rabbit MQ metrics.

name: com.dynatrace.extension.prometheus-rabbitmq
version: 1.0.0
minDynatraceVersion: "1.225"
author:
  name: Dynatrace

dashboards:
  - path: "dashboards/dashboard_exporter.json"

alerts:
  - path: "alerts/alert_socket_usage.json"

# Extension based on official rabbitmq prometheus exporter available metrics
# list of metrics visible here https://github.com/rabbitmq/rabbitmq-server/blob/master/deps/rabbitmq_prometheus/metrics.md
prometheus:
  - group: rabbitmq metrics
    interval: 1m
    featureSet: all
    dimensions:
      - key: rabbitmq
        value: const:rabbitmq
    subgroups:
      # global counters
      - subgroup: rabbitmq global counter
        dimensions:
          - key: global_counters
            value: const:global_counters
        metrics:
          - key: com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_acknowledged_total
            value: metric:rabbitmq_global_messages_acknowledged_total
            type: count
            featureSet: global
          - key: com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_confirmed_total
            value: metric:rabbitmq_global_messages_confirmed_total
            type: count
            featureSet: global
          - key: com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_delivered_consume_auto_ack_total
            value: metric:rabbitmq_global_messages_delivered_consume_auto_ack_total
            type: count
            featureSet: global

Your Prometheus monitoring scope definition starts with the prometheus YAML node. All the settings under the node pertain to the declared data source type (in this case, Prometheus).

Dimensions

For each level (group, subgroup), you can define up to 25 dimensions (which gives you a total of 50 dimensions per metric).

Dimension key

The dimension key string must conform to the metrics ingestion protocol.

Note: For Dynatrace versions 1.215 and 1.217, a dimension node requires the id parameter in place of key. Starting with Dynatrace version 1.219, we recommend that you use the key parameter, as id will be deprecated.

Dimension value

You can use the following methods to define dimensions for your metrics:

  • Plain text. Prefix with const: or simply add the required text
    dimensions:
    - key: extension.owner
      value: const:Joe.Doe@somedomain.com
    
    or
    dimensions:
    - key: extension.owner
      value: Joe.Doe@somedomain.com
    
  • Prometheus label
    dimensions:
     - key: customdimension.job
       value: label:job
       filter: const:$eq(prometheus)
    
    All the labels exposed by Prometheus are created as dimensions automatically. You only need to explicitly define a label-based dimension if you want to:
    • apply filtering on the values,
    • define a custom dimension key.

Filter extracted metric lines

When extracting metric lines, you can add filtering logic that will result in reporting only the lines for which the dimension value matches the filtering criteria.

  • Report dimensions only for values that start with the string of your choice
    filter: const:$prefix(xyz)
    
  • Report dimensions only for values that end with the string of your choice
    filter: const:$suffix(xyz)
    
  • Report dimensions only for values containing a string of your choice
    filter: const:$contains(xyz)
    
  • Report dimensions only for values that are equal to the string of your choice
    filter: const:$eq(xyz)
    

You can create complex filters by combining two or more filters separated by commas using logical expressions:

  • $or() At least one of the given filters matches
  • $and() All of the provided filters match
  • $not() The filter doesn't match

For example:

dimensions:
      - key: technology
        value: other
      - key: job
        value: label:job
        filter: const:$or($eq(),$not($or($eq(prometheus),$eq(rabbitmq-server),$eq(redis_exporter),$eq(node_exporter))))
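
To reason about a combined filter before you deploy it, you can model the documented semantics locally. The following Python sketch is not the Dynatrace implementation; it only mirrors the behavior described above ($prefix, $suffix, $contains, $eq, $or, $and, $not) and assumes that filter arguments don't themselves contain parentheses. The function name matches is hypothetical.

```python
def _split_args(body):
    """Split on commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(body[start:i])
            start = i + 1
    args.append(body[start:])
    return args


# Leaf filters compare the dimension value with a literal argument.
LEAF = {
    "$prefix": lambda value, arg: value.startswith(arg),
    "$suffix": lambda value, arg: value.endswith(arg),
    "$contains": lambda value, arg: arg in value,
    "$eq": lambda value, arg: value == arg,
}


def matches(expr, value):
    """Return True if the dimension value passes the filter expression."""
    if expr.startswith("const:"):
        expr = expr[len("const:"):]
    name, _, rest = expr.partition("(")
    if not rest.endswith(")"):
        raise ValueError(f"malformed filter: {expr!r}")
    body = rest[:-1]
    if name in LEAF:
        return LEAF[name](value, body)
    results = [matches(part, value) for part in _split_args(body)]
    if name == "$or":
        return any(results)
    if name == "$and":
        return all(results)
    if name == "$not":
        if len(results) != 1:
            raise ValueError("$not takes exactly one filter")
        return not results[0]
    raise ValueError(f"unknown filter: {name!r}")


JOB_FILTER = ("const:$or($eq(),$not($or($eq(prometheus),$eq(rabbitmq-server),"
              "$eq(redis_exporter),$eq(node_exporter))))")
for job in ("", "prometheus", "my-app"):
    print(repr(job), matches(JOB_FILTER, job))
```

Under this model, the example filter keeps lines whose job label is empty or is anything other than the four listed exporters.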

Metrics

For each level (group, subgroup), you can define up to 100 metrics. Note, however, that there is a hard limit of 1,000 metrics per extension applied at runtime. This limit is lower than the combined limits of allowed groups and subgroups.

For example:

prometheus:
  - group: rabbitmq metrics
    interval: 1m
    featureSet: all
    dimensions:
      - key: rabbitmq
        value: const:rabbitmq
    metrics:
      - key: com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_acknowledged_total
        value: metric:rabbitmq_global_messages_acknowledged_total
        type: count
      - key: com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_confirmed_total
        value: metric:rabbitmq_global_messages_confirmed_total
        type: count
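
As a rough pre-upload check, you could count the metrics in your definition against the limits stated above. The Python sketch below assumes that the prometheus node has already been loaded into plain lists and dictionaries (for example, with a YAML parser of your choice). The function name check_metric_limits is hypothetical, and the check is no substitute for the validation Dynatrace performs.

```python
MAX_METRICS_PER_LEVEL = 100        # per group or subgroup
MAX_METRICS_PER_EXTENSION = 1000   # hard limit applied at runtime


def check_metric_limits(groups):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    total = 0
    for group in groups:
        levels = [(group.get("group", "<unnamed group>"), group)]
        for sub in group.get("subgroups", []):
            levels.append((sub.get("subgroup", "<unnamed subgroup>"), sub))
        for name, level in levels:
            count = len(level.get("metrics", []))
            total += count
            if count > MAX_METRICS_PER_LEVEL:
                problems.append(
                    f"{name}: {count} metrics (limit {MAX_METRICS_PER_LEVEL})")
    if total > MAX_METRICS_PER_EXTENSION:
        problems.append(
            f"extension: {total} metrics (limit {MAX_METRICS_PER_EXTENSION})")
    return problems


# Mirrors the YAML example above as Python data.
prometheus = [{
    "group": "rabbitmq metrics",
    "interval": "1m",
    "metrics": [
        {"key": "com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_acknowledged_total",
         "value": "metric:rabbitmq_global_messages_acknowledged_total",
         "type": "count"},
        {"key": "com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_confirmed_total",
         "value": "metric:rabbitmq_global_messages_confirmed_total",
         "type": "count"},
    ],
}]
print(check_metric_limits(prometheus))
```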

Metric key

The metric key string must conform to the metrics ingestion protocol.

Note: For Dynatrace versions 1.215 and 1.217, a metric node requires the id parameter in place of key. Starting with Dynatrace version 1.219, we recommend that you use the key parameter, as id will be deprecated.

Best practices for metric keys

The metrics you ingest into Dynatrace using your extension are just some of the thousands of metrics, built-in and custom, processed by Dynatrace. To make your metric keys unique and easy to identify in Dynatrace, the best practice is to prefix the metric name with the extension name. This makes key collisions with other extensions unlikely and lets you easily attribute a metric to a particular extension in your environment.
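
The prefixing convention can be sketched as a pair of small Python helpers. The names metric_key and belongs_to are hypothetical, and the sketch doesn't validate the result against the metrics ingestion protocol.

```python
def metric_key(extension_name, *parts):
    """Join the extension name and the metric name parts with dots."""
    return ".".join((extension_name,) + parts)


def belongs_to(key, extension_name):
    """Return True if the metric key carries the extension's prefix."""
    return key.startswith(extension_name + ".")


EXTENSION = "com.dynatrace.extension.prometheus-rabbitmq"
key = metric_key(EXTENSION, "global", "global_messages_acknowledged_total")
print(key)
```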

Metric value

The key of the Prometheus metric from which you want to extract the value, prefixed with metric:.

Type

The Dynatrace Extensions 2.0 framework supports metric payloads in the gauge (gauge) or count value (count) formats. For details, see Metric payload. To indicate the metric type, use the type attribute.

Metric metadata

An extension can define metadata for each metric available in Dynatrace. For example, you might want to add the metric display name and the unit, both of which can be used for filtering in the Metrics browser.

name: custom:example-extension-name
version: 1.0.0
minDynatraceVersion: "1.218"
author:
  name: Dynatrace

metrics:
  - key: your.metric.name
    metadata:
        displayName: Display name of the metric visible in Metrics browser
        unit: Count

Feature set

Feature sets are categories into which you organize the data collected by the extension. You can define feature sets at the group, subgroup, or metric level. In this example, we create a Prometheus extension collecting application and network metrics, organized into the feature sets prometheus_app_metrics and prometheus_net_metrics.

prometheus:
  - group: prometheus metrics
    interval: 1m
    metrics:
      - key: com.dynatrace.extension.prometheus.app
        value: prometheus.app
        featureSet: prometheus_app_metrics
      - key: com.dynatrace.extension.prometheus.net
        value: prometheus.net
        featureSet: prometheus_net_metrics

When activating your extension in monitoring configuration, you can limit monitoring to one of the feature sets.

In highly segmented networks, feature sets can reflect the segments of your environment. When you create a monitoring configuration, you can select a feature set and a corresponding ActiveGate group that can connect to that particular segment.

All metrics that aren't categorized into any feature set are considered to be default and are always reported.

Note: A metric inherits the feature set of a subgroup, which in turn inherits the feature set of a group. Also, the feature set defined on the metric level overrides the feature set defined on the subgroup level, which in turn overrides the feature set defined on the group level.
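
The inheritance rule can be read as a lookup that starts at the most specific level. The Python sketch below models that rule on plain dictionaries. The function name effective_setting is hypothetical, and the sample data is a trimmed copy of the RabbitMQ definition shown earlier.

```python
def effective_setting(name, group, subgroup=None, metric=None):
    """Return the value from the most specific level that defines `name`.

    Lookup order: metric, then subgroup, then group. Returns None if no
    level defines the setting.
    """
    for level in (metric, subgroup, group):
        if level and name in level:
            return level[name]
    return None


group = {"group": "rabbitmq metrics", "interval": "1m", "featureSet": "all"}
subgroup = {"subgroup": "rabbitmq global counter"}
metric_with_own = {"key": "example.metric.a", "featureSet": "global"}
metric_without = {"key": "example.metric.b"}

print(effective_setting("featureSet", group, subgroup, metric_with_own))
print(effective_setting("featureSet", group, subgroup, metric_without))
```

The first metric defines its own feature set, so that value wins; the second falls back through the subgroup to the group.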

Interval

The interval at which the data measurement will be taken. You can define intervals at the group, subgroup, or individual metric level. You can define intervals with the granularity of one minute (for example, 5m). The maximum interval is 2880m (2 days, 48 hours).

prometheus:
  - group: prometheus metrics
    interval: 1m
    dimensions:
      - key: technology
        value: prometheus
    metrics:
    - key: com.dynatrace.extension.prometheus-rabbitmq.global.global_messages_delivered_get_auto_ack_total
      value: metric:rabbitmq_global_messages_delivered_get_auto_ack_total
      type: count

Note: A metric inherits the interval of a subgroup, which in turn inherits the interval of a group. Also, the interval defined on the metric level overrides the interval defined on the subgroup level, which in turn overrides the interval defined on the group level.
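
A small validator can catch interval typos before you package the extension. This Python sketch assumes that intervals are always written as a whole number of minutes followed by m (as in the examples) and that 1m is the smallest accepted value; the function name parse_interval is hypothetical.

```python
import re

MAX_INTERVAL_MINUTES = 2880  # 48 hours, the documented maximum


def parse_interval(text):
    """Return the interval in minutes, or raise ValueError if invalid."""
    match = re.fullmatch(r"([1-9][0-9]*)m", text)
    if not match:
        raise ValueError(f"interval must look like '5m', got {text!r}")
    minutes = int(match.group(1))
    if minutes > MAX_INTERVAL_MINUTES:
        raise ValueError(
            f"interval {text!r} exceeds the maximum of {MAX_INTERVAL_MINUTES}m")
    return minutes


print(parse_interval("1m"), parse_interval("2880m"))
```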

Monitoring configuration

After you define the scope of your configuration, you need to identify the Prometheus endpoints from which to collect data.

The monitoring configuration is a JSON payload defining the connection details, credentials, and feature sets that you want to monitor. For details, see Start monitoring.

Example payload to activate a Prometheus extension:

[
  {
    "scope": "ag_group-default",
    "value": {
      "version": "1.0.0",
      "description": "name",
      "enabled": true,
      "activationContext": "REMOTE",
      "prometheusRemote": {
          "endpoints": [
            {
              "url": "https://myPrometheusServer/metrics",
              "authentication": {
                "scheme": "basic",
                "username": "user",
                "password": "password"
              }
            }
          ]
      },
      "featureSets": [
        "myFeatureSet"
      ]
    }
  }
]
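
If you generate monitoring configurations from a script, the payload above can be assembled from a few parameters. The Python sketch below reproduces the structure of the example for a remote extension with basic authentication. The function name remote_monitoring_configuration and all argument values are hypothetical; check the result against the schema of your own extension.

```python
import json


def remote_monitoring_configuration(ag_group, url, username, password,
                                    feature_sets, version="1.0.0",
                                    description=""):
    """Build the JSON-serializable payload for a remote Prometheus extension."""
    return [{
        "scope": f"ag_group-{ag_group}",
        "value": {
            "version": version,
            "description": description,
            "enabled": True,
            "activationContext": "REMOTE",
            "prometheusRemote": {
                "endpoints": [{
                    "url": url,
                    "authentication": {
                        "scheme": "basic",
                        "username": username,
                        "password": password,
                    },
                }],
            },
            "featureSets": list(feature_sets),
        },
    }]


payload = remote_monitoring_configuration(
    ag_group="default",
    url="https://myPrometheusServer/metrics",
    username="user",
    password="password",
    feature_sets=["myFeatureSet"],
    description="name",
)
print(json.dumps(payload, indent=2))
```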

When you have your initial extension YAML ready, package it, sign it, and upload it to your Dynatrace environment. For details, see Manage extension lifecycle.

Then you can use the Dynatrace API to download the schema for your extension that will help you create the JSON payload for your monitoring configuration.

Use the GET an extension schema endpoint.

Issue the following request:

curl -X GET "https://{env-id}.live.dynatrace.com/api/v2/extensions/{extension-name}/{extension-version}/schema" \
   -H "accept: application/json; charset=utf-8" \
   -H "Authorization: Api-Token {api-token}"

Make sure to replace {extension-name} and {extension-version} with values from your extension YAML file. A successful call returns the JSON schema.
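
The same request can be prepared with the Python standard library. The sketch below only builds the request object and doesn't send it. It assumes that the environment is reached over HTTPS at {env-id}.live.dynatrace.com; the function name schema_request, the environment ID, and the token value are hypothetical placeholders.

```python
import urllib.request


def schema_request(env_id, extension_name, extension_version, api_token):
    """Build (but don't send) the GET request for an extension schema."""
    url = (f"https://{env_id}.live.dynatrace.com/api/v2/extensions/"
           f"{extension_name}/{extension_version}/schema")
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "accept": "application/json; charset=utf-8",
            "Authorization": f"Api-Token {api_token}",
        },
    )


request = schema_request(
    env_id="abc12345",
    extension_name="com.dynatrace.extension.prometheus-rabbitmq",
    extension_version="1.0.0",
    api_token="REPLACE_WITH_YOUR_TOKEN",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and parse the response body as JSON.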

Scope

Note that each OneAgent or ActiveGate host running your extension needs the root certificate to verify the authenticity of your extension. For more information, see Sign extension.

Remote extension

For a remote extension, the scope is an ActiveGate group that will execute the extension. Only one ActiveGate from the group will run this monitoring configuration. If you plan to use a single ActiveGate, assign it to a dedicated group. You can assign an ActiveGate to a group during installation with the --set-group-name installation parameter for Linux and Windows, or by configuring your ActiveGate.

Use the following format when defining the ActiveGate group:

"scope": "ag_group-<ActiveGate-group-name>",

Replace <ActiveGate-group-name> with the actual name.

Local extension

For a local extension, the scope is a host or a host group where you will execute the extension.

  • When defining a host as the scope, use this format:
    "scope": "<HOST_ID>",
    
    Replace <HOST_ID> with the entity ID of the host as in this example:
    "scope": "HOST-A1B2345678C9D001",
    
  • When defining a host group as the scope, use this format:
    "scope": "host_group-<HOST_GROUP_ID>",
    
    Replace <HOST_GROUP_ID> with the entity ID of the host group as in this example:
    "scope": "host_group-HOST_GROUP-AB123C4D567E890",
    
    You can find the host group ID in the host group settings page URL. For example:
    https://{your-environment-id}.live.dynatrace.com/#settings/hostgroupconfiguration;id=HOST_GROUP-AB123C4D567E890;hostGroupName=my-host-group
    

If you activate a local Prometheus extension and define the endpoint of a Prometheus server running on the same host, the metrics gathered from that server may come from various endpoints, not only from the endpoint on that host. All of these metrics, however, will be enriched with the context of the host where OneAgent is installed.
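
The scope formats described in this section can be captured in a few Python helpers. The function names are hypothetical. The ID checks only test for the HOST- and HOST_GROUP- prefixes seen in the examples, and the URL parser assumes the id=...; fragment shown in the settings page URL above.

```python
import re


def remote_scope(activegate_group):
    """Scope for a remote extension: an ActiveGate group."""
    return f"ag_group-{activegate_group}"


def host_scope(host_id):
    """Scope for a local extension on a single host."""
    if not host_id.startswith("HOST-"):
        raise ValueError(f"expected a host entity ID, got {host_id!r}")
    return host_id


def host_group_scope(host_group_id):
    """Scope for a local extension on a host group."""
    if not host_group_id.startswith("HOST_GROUP-"):
        raise ValueError(
            f"expected a host group entity ID, got {host_group_id!r}")
    return f"host_group-{host_group_id}"


def host_group_id_from_url(settings_url):
    """Extract the host group ID from a host group settings page URL."""
    match = re.search(r"id=(HOST_GROUP-[0-9A-Z]+)", settings_url)
    if not match:
        raise ValueError("no host group ID found in URL")
    return match.group(1)


url = ("https://{your-environment-id}.live.dynatrace.com/#settings/"
       "hostgroupconfiguration;id=HOST_GROUP-AB123C4D567E890;"
       "hostGroupName=my-host-group")
print(host_group_scope(host_group_id_from_url(url)))
```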

Version

The version of this monitoring configuration. Note that a single extension can run multiple monitoring configurations.

Description

A human-readable description of the specifics of this monitoring configuration.

Enabled

If set to true, the configuration is active and Dynatrace starts monitoring immediately.

Activation context

  • For remote extensions, set activationContext to REMOTE
  • For local extensions, set activationContext to LOCAL

URL

The URL is the Prometheus endpoint from which your extension pulls the metrics. The maximum URL length is 500 characters.

  • For local extensions, define the Prometheus endpoint in the prometheusLocal node.
  • For remote extensions, define the Prometheus endpoint in the prometheusRemote node.

You can define the following endpoint types:

  • Prometheus /metrics endpoint that returns metrics in plain text Prometheus format.
  • Prometheus API /api/v1/ path that could be followed directly by a query or metadata endpoint.
  • Prometheus /api/v1/targets endpoint returning targets of the Prometheus instance. As a result, all of the OneAgent or ActiveGate reachable targets received in the response are scraped in the same way as if these endpoints were passed individually.

If you gather the same metrics from different endpoints (either Prometheus server or data exporter), some metrics could be overwritten, as the keys would be identical regardless of the endpoint. To avoid this, we automatically add an extra activation_endpoint dimension to each metric. For the /api/v1/targets endpoint, the URLs of discovered targets are used instead.

Authentication

Authentication details passed to the Dynatrace API when you activate a monitoring configuration are obfuscated, and it's impossible to retrieve them.

The following authentication schemes are supported:

  • No authentication. By default, supported only for HTTP endpoints.
    "endpoints": [
              {
                "url": "http://myPrometheusServer/metrics",
                "authentication": {
                  "scheme": "none"
                }
              }
            ]
    
  • Bearer - requires token.
    "endpoints": [
              {
                "url": "https://myPrometheusServer/metrics",
                "authentication": {
                  "scheme": "bearer",
                  "token": "myToken"
                }
              }
            ]
    
  • Basic - requires username and password.
    "endpoints": [
              {
                "url": "https://myPrometheusServer/metrics",
                "authentication": {
                  "scheme": "basic",
                  "username": "user",
                  "password": "password"
                }
              }
            ]
    
  • AWS - requires AWS access key, secret key, and region.
    "endpoints": [
              {
                "url": "https://myPrometheusServer/metrics",
                "authentication": {
                  "scheme": "aws",
                  "accessKey": "accessKey",
                  "secretKey": "secretKey",
                  "region": "us-east-2"
                }
              }
            ]
    

If you try to use an HTTP endpoint with the bearer, basic, or AWS scheme, the extension framework throws an error to avoid sending sensitive data over an unsafe connection. If, however, you're sure you can do so, set the skipVerifyHttps property to true.

"endpoints": [
           {
             "url": "http://myPrometheusServer/metrics",
             "authentication": {
               "scheme": "basic",
               "username": "user",
               "password": "password",
               "skipVerifyHttps": "true"
             }
           }
         ]
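
You can mirror this rule in a local pre-flight check so that a misconfigured endpoint is caught before activation. The Python sketch below models the behavior described above and is not the framework's own validation. It assumes that skipVerifyHttps sits inside the authentication node, as in the example; the function name check_endpoint is hypothetical.

```python
from urllib.parse import urlparse


def check_endpoint(endpoint):
    """Raise ValueError if credentials would be sent over plain HTTP."""
    auth = endpoint.get("authentication", {})
    scheme = auth.get("scheme", "none")
    is_plain_http = urlparse(endpoint["url"]).scheme == "http"
    # The example passes the flag as the string "true"; accept booleans too.
    skip = str(auth.get("skipVerifyHttps", "false")).lower() == "true"
    if is_plain_http and scheme != "none" and not skip:
        raise ValueError(
            f"{scheme} authentication over HTTP for {endpoint['url']}; "
            "use HTTPS or set skipVerifyHttps to true")


unsafe = {
    "url": "http://myPrometheusServer/metrics",
    "authentication": {"scheme": "basic", "username": "user",
                       "password": "password"},
}
try:
    check_endpoint(unsafe)
except ValueError as error:
    print(error)
```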

Feature sets

Add a list of feature sets you want to monitor. To report all feature sets, add all.

"featureSets": [
  "basic",
  "advanced"
]

Advanced

Optionally, you can define advanced settings controlling the HTTP connection to your Prometheus endpoints:

  • timeoutSecs
    An integer between 0 and 50. The number of seconds to wait for a response from the Prometheus endpoint.
  • retries
    The number of connection retries. The maximum number of retries is 3.

It is possible to have a maximum of 3 connection retries of 50 seconds each.

Note: Make sure the total waiting time is never longer than the interval you set for your metrics.
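
To check the note above for a given configuration, you can estimate the worst-case waiting time and compare it with the metric interval. The Python sketch below assumes that every attempt (the initial one plus each retry) can wait for the full timeoutSecs; this is a conservative assumption, not documented framework behavior. The function names are hypothetical.

```python
def worst_case_wait_secs(timeout_secs, retries):
    """Estimate the longest time one scrape can spend waiting."""
    if not 0 <= timeout_secs <= 50:
        raise ValueError("timeoutSecs must be between 0 and 50")
    if not 0 <= retries <= 3:
        raise ValueError("retries must be between 0 and 3")
    # Assumption: the initial attempt and each retry may time out in full.
    return timeout_secs * (retries + 1)


def fits_interval(timeout_secs, retries, interval_minutes):
    """Return True if the worst-case wait fits within the interval."""
    return worst_case_wait_secs(timeout_secs, retries) <= interval_minutes * 60


print(fits_interval(timeout_secs=50, retries=3, interval_minutes=1))
print(fits_interval(timeout_secs=10, retries=3, interval_minutes=1))
```

With the maximum values (50 seconds, 3 retries), the estimate exceeds a 1m interval, so either the timeout or the retry count would need to be reduced.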