Dynatrace root-cause analysis relies on AI-driven event and data correlation to collect and analyze information about your environment, including all transactions, events, metrics, and topology. Dynatrace uses this information to quickly pinpoint the root causes of abnormal situations. Dynatrace currently correlates hundreds of built-in event types, including CPU saturation, Response time degradation, and Error rate increase. By writing and deploying your own monitoring plugins, or by using the Dynatrace API, you can now add new types of events to Dynatrace event correlation. A new Dynatrace API endpoint lets you define new custom events and edit existing event types.

New event types for custom plugins

Dynatrace custom plugin developers can now define new event types by adding an alert_settings section to one or more metric definitions. Within alert_settings, plugin developers can define one or more events, each with its own threshold and sliding window size. Placeholders within event descriptions (for example, {severity} and {threshold}) are replaced automatically with the corresponding values.

The example below defines a custom event on the JMX metric Message count:

{
    "version": "1.0",
    "name": "custom.jmx.hornetq",
    "type": "JMX",
    "processTypes": [
        10, 12, 13, 16, 17, 18
    ],
    "configUI": {
        "displayName": "HornetQ JMX Wolfgang"
    },
    "entity": "PROCESS_GROUP_INSTANCE",
    "metrics": [
        {
            "timeseries": {
                "key": "Queue.MessageCount",
                "displayname": "Message count",
                "unit": "Count",
                "dimensions": [
                    "rx_pid",
                    "name"
                ]
            },
            "source": {
                "domain": "org.hornetq",
                "keyProperties": {
                    "module": "JMS",
                    "type": "Queue",
                    "name": "*"
                },
                "allowAdditionalKeys": false,
                "attribute": "MessageCount",
                "calculateDelta": false,
                "calculateRate": false,
                "aggregation": "MAX",
                "splitting": {
                    "name": "name",
                    "type": "keyProperty",
                    "keyProperty": "name"
                }
            },
            "alert_settings": [
                {
                    "alert_id": "jmx_alert",
                    "event_type": "ERROR_EVENT",
                    "event_name": "Low message count",
                    "description": "Actual number of {severity} queue messages is {alert_condition} the critical threshold of {threshold}",
                    "threshold": 100.0,
                    "alert_condition": "BELOW",
                    "samples": 5,
                    "violating_samples": 3,
                    "dealerting_samples": 5
                }
            ]
        }
    ],
    "ui": {
        "keycharts": [
            {
                "group": "HornetQ",
                "title": "Queue depth and ingress",
                "series": [
                    {
                        "key": "Queue.MessageCount",
                        "aggregation": "avg",
                        "mergeaggregation": "sum",
                        "displayname": "Average queue depth",
                        "seriestype": "area"
                    }
                ]
            }
        ],
        "charts": []
    }
}

To change the threshold of a custom plugin event, go to Settings > Anomaly detection > Plugin events, where all custom event parameters are listed.

Once the custom plugin is successfully deployed, a custom event is raised automatically when the value of the metric Message count falls below the threshold of 100 in 3 of 5 one-minute sample periods.
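To make the 3-of-5 rule concrete, here is a minimal Python sketch of how such a sliding window could be evaluated. The function name and the exact semantics are assumptions for illustration, not the Dynatrace implementation: a sample violates when it is below the threshold, the event opens once violating_samples of the last samples violate, and it closes after dealerting_samples consecutive non-violating samples.

```python
from collections import deque


def evaluate_samples(values, threshold=100.0, samples=5,
                     violating_samples=3, dealerting_samples=5):
    """Return (index, state) transitions for a series of one-minute samples.

    Assumed semantics (illustrative only, not confirmed by the source):
    - a sample violates when its value is below the threshold (BELOW);
    - the event opens when violating_samples of the last `samples` violate;
    - the event closes after dealerting_samples consecutive calm samples.
    """
    window = deque(maxlen=samples)  # violation flags of the most recent samples
    calm_streak = 0                 # consecutive non-violating samples
    is_open = False
    transitions = []

    for index, value in enumerate(values):
        violating = value < threshold
        window.append(violating)
        calm_streak = 0 if violating else calm_streak + 1

        if not is_open and sum(window) >= violating_samples:
            is_open = True
            transitions.append((index, "OPEN"))
        elif is_open and calm_streak >= dealerting_samples:
            is_open = False
            window.clear()  # start fresh so old violations cannot re-open the event
            transitions.append((index, "CLOSED"))

    return transitions
```

Called with the sample series 120, 90, 110, 80, 95 followed by five values above 100, this sketch opens the event at the fifth sample (the third violation within the window) and closes it five calm samples later.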

When a new problem is raised based on this custom plugin event, the Problems feed displays the new problem with the custom event name Low message count and the placeholders in the description text replaced with the configured values.
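The placeholder substitution can be approximated locally to preview a description before deploying a plugin. The sketch below is only an illustration that uses Python's own string formatting. The helper names are hypothetical, and the values passed in (in particular the one for {severity}, whose actual expansion the plugin definition does not reveal) are assumptions.

```python
from string import Formatter


def placeholders(template):
    """List the {placeholder} names used in an event description, in order."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def render_description(template, values):
    """Fill the placeholders with caller-supplied values (hypothetical preview helper)."""
    return template.format(**values)


DESCRIPTION = ("Actual number of {severity} queue messages is "
               "{alert_condition} the critical threshold of {threshold}")

# Assumed example values; Dynatrace may format these differently.
preview = render_description(
    DESCRIPTION,
    {"severity": "available", "alert_condition": "below", "threshold": 100},
)
```

Listing the placeholders first is a cheap way to catch a misspelled name before the description shows up unexpanded in a problem.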

What makes custom events even more powerful is that you can now use the Dynatrace REST API to modify thresholds as well as any other event attributes. API consumers can create new event types or modify existing ones.

The Dynatrace REST API call below reads all defined custom events within an environment:

https://<ENVIRONMENT>.live.dynatrace.com/api/v1/thresholds/?Api-Token=<TOKEN>

The API call lists all configured plugin and custom thresholds, so you can easily identify the new event thresholds introduced by the example above.
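For scripted use, the same read call can be issued from Python's standard library. This is a minimal sketch: the function names are hypothetical, the URL pattern and the Api-Token query parameter are taken from the call above, and the shape of the returned JSON is not assumed.

```python
import json
import urllib.parse
import urllib.request


def build_list_thresholds_request(environment, token):
    """Build the GET request that reads all thresholds of an environment."""
    query = urllib.parse.urlencode({"Api-Token": token})
    url = f"https://{environment}.live.dynatrace.com/api/v1/thresholds/?{query}"
    return urllib.request.Request(url, method="GET")


def list_thresholds(environment, token):
    """Send the request and return the decoded JSON body as-is."""
    request = build_list_thresholds_request(environment, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect or log the URL without touching the network.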

An HTTP PUT request to the threshold identifier is used to either modify existing thresholds or to create new thresholds (as shown below).

HTTP PUT https://<ENVIRONMENT_ID>.live.dynatrace.com/api/v1/thresholds/custom.jmx.hornetq:Queue.MessageCount:jmx_alert?Api-Token=<TOKEN>

Here’s the new payload:

{
    "timeseriesId": "custom.jmx.hornetq:Queue.MessageCount",
    "threshold": 99,
    "alertCondition": "BELOW",
    "samples": 5,
    "violatingSamples": 3,
    "dealertingSamples": 5,
    "eventType": "ERROR_EVENT",
    "eventName": "Low message count",
    "filter": "PLUGIN",
    "description": "Actual number of {severity} queue messages is {alert_condition} the critical threshold of {threshold}"
}
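The PUT request can be sketched in Python as follows. The function name is hypothetical, the host pattern follows the read call shown earlier, and the Content-Type header is an assumption about what the endpoint expects for a JSON body.

```python
import json
import urllib.parse
import urllib.request


def build_put_threshold_request(environment, token, threshold_id, payload):
    """Build the PUT request that creates or updates one threshold."""
    query = urllib.parse.urlencode({"Api-Token": token})
    path = urllib.parse.quote(threshold_id, safe=":")  # keep the ':' separators readable
    url = (f"https://{environment}.live.dynatrace.com"
           f"/api/v1/thresholds/{path}?{query}")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Assumption: the endpoint accepts a JSON body declared this way.
        headers={"Content-Type": "application/json"},
    )


put_request = build_put_threshold_request(
    "abc12345",  # placeholder environment ID
    "secret",    # placeholder API token
    "custom.jmx.hornetq:Queue.MessageCount:jmx_alert",
    {"timeseriesId": "custom.jmx.hornetq:Queue.MessageCount", "threshold": 99},
)
# To send it: urllib.request.urlopen(put_request)
```

The payload here is shortened for brevity; a real call would carry all the fields shown in the payload above.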

The thresholds API is also used to define events and thresholds for custom network devices. The HTTP PUT request is the same as shown above; however, you must choose one of your own custom metrics instead of the JMX metric.

The newly introduced thresholds API enables powerful use cases, including defining new events on demand and automatically applying thresholds supplied by third-party systems. This enables your DevOps teams to define new custom metric-based events using automated scripts and to further embed Dynatrace monitoring into your IT automation environments.
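As a starting point for such automation, the sketch below converts an alert_settings entry from a plugin definition into the payload and identifier used by the PUT request. The function names are hypothetical, and the identifier pattern (plugin name, metric key, and alert ID joined by colons) is inferred from the single example in this post rather than from a documented rule.

```python
def threshold_id(plugin_name, metric_key, alert_id):
    """Compose the threshold identifier (pattern inferred from the example URL)."""
    return f"{plugin_name}:{metric_key}:{alert_id}"


def to_threshold_payload(plugin_name, metric_key, alert, new_threshold=None):
    """Map snake_case alert_settings fields to the camelCase API payload."""
    return {
        "timeseriesId": f"{plugin_name}:{metric_key}",
        "threshold": alert["threshold"] if new_threshold is None else new_threshold,
        "alertCondition": alert["alert_condition"],
        "samples": alert["samples"],
        "violatingSamples": alert["violating_samples"],
        "dealertingSamples": alert["dealerting_samples"],
        "eventType": alert["event_type"],
        "eventName": alert["event_name"],
        "filter": "PLUGIN",  # value copied from the example payload
        "description": alert["description"],
    }


ALERT = {
    "alert_id": "jmx_alert",
    "event_type": "ERROR_EVENT",
    "event_name": "Low message count",
    "description": "Actual number of {severity} queue messages is "
                   "{alert_condition} the critical threshold of {threshold}",
    "threshold": 100.0,
    "alert_condition": "BELOW",
    "samples": 5,
    "violating_samples": 3,
    "dealerting_samples": 5,
}

payload = to_threshold_payload("custom.jmx.hornetq", "Queue.MessageCount",
                               ALERT, new_threshold=99)
```

A script could feed new_threshold from a third-party system and send the resulting payload with an HTTP PUT to the identifier returned by threshold_id.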