Create and configure incident rules

An incident rule in AppMon is a mapping between measured thresholds and the actions to be taken when such thresholds are violated. If all the thresholds defined in the incident rule are violated, this is an incident, even if no actions have been configured for the incident rule.

You need to configure the following mandatory settings for an incident rule to work:

Additionally, you can configure:

  • Actions: Automatic actions to be taken when the rule triggers.
  • Linked dashboard: A dashboard that you can quickly open when the rule triggers.

To create a new incident rule, click Create Incident Rule in the Incidents item of the System Profile Preferences dialog box.

To edit an existing rule, double-click it in the same item. You can also right-click the required rule in the Incidents Overview dashlet and select Edit incident rule from the context menu.

The procedure for editing an existing rule is generally the same as creating a new rule. However, you can't edit conditions for some of the built-in incidents.

Incident rule configuration

General settings

  • Name: Give the rule a unique name that describes its purpose.
  • Description: Optionally, provide a description of the rule. The description can include the measures associated with it and the action that occurs if the rule is violated.
  • Evaluation Timeframe: Select the duration used to evaluate if the defined conditions are met.

    For example, if you select a timeframe of one minute and use a PurePath duration measure with an average aggregation as input for the incident rule, the average PurePath duration of the last minute is checked for violation every 10 seconds. Measures remain in memory for one hour.

    The following graphic illustrates how timeframe is used to check for incidents.

    Incidents close only when their condition hasn’t been met for a minute. This more effectively deals with measurements oscillating around their thresholds. Once this timespan has passed without further violation, incidents are closed with the closing time set to fit actual measurements.

  • Incident Severity: Choose the severity of incidents to show in incident-related dashlets. Incident severities are Informational, Warning, or Severe. Select the level that requires a response to notifications.
  • Incident Suppression: Set the incident delay in seconds. Incidents are suppressed during the configured number of seconds after an occurrence, to avoid sending redundant notifications.
  • Store incidents in Performance Warehouse: Enabled by default, this setting guarantees a complete incident history for the available charting data time frame. Disable this option if an incident should trigger actions without keeping a historical record. When the option is disabled, all configured actions are still performed, but closed incidents are deleted no later than the next AppMon Server restart.
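The Evaluation Timeframe example above (an average over the last minute, checked every 10 seconds) can be sketched as a sliding-window check. This is only an illustration of the idea, not AppMon's implementation: the `WindowedCheck` class, its parameters, and the sample values are hypothetical, and the one-hour retention is modeled simply by dropping older samples.

```python
from collections import deque


class WindowedCheck:
    """Hypothetical helper: averages the samples that fall inside the
    evaluation timeframe and compares the result with a threshold."""

    def __init__(self, timeframe_s, threshold, retention_s=3600):
        self.timeframe_s = timeframe_s  # evaluation timeframe, e.g. 60 seconds
        self.threshold = threshold      # assumed upper threshold for the average
        self.retention_s = retention_s  # measures stay in memory for one hour
        self.samples = deque()          # (timestamp_s, value) pairs, oldest first

    def add(self, timestamp_s, value):
        self.samples.append((timestamp_s, value))
        # Drop samples that are older than the retention period.
        while self.samples and self.samples[0][0] <= timestamp_s - self.retention_s:
            self.samples.popleft()

    def violated(self, now_s):
        # Only samples inside (now - timeframe, now] take part in the average.
        window = [v for t, v in self.samples
                  if now_s - self.timeframe_s < t <= now_s]
        if not window:
            return False
        return sum(window) / len(window) > self.threshold


# Made-up PurePath durations in milliseconds, keyed by timestamp in seconds.
check = WindowedCheck(timeframe_s=60, threshold=500.0)
for t, v in [(5, 200.0), (25, 300.0), (45, 400.0), (65, 1500.0)]:
    check.add(t, v)

# Evaluate every 10 seconds, as in the example above.
results = [(now, check.violated(now)) for now in range(50, 80, 10)]
print(results)
```

In this sketch the check only reports a violation at second 70, once the slow sample at second 65 enters the one-minute window and the fast sample at second 5 has left it.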

Conditions

A condition for an incident rule is a measure with defined Warning and/or Severe thresholds. Each rule must have at least one condition. If the incident rule has several conditions, you must define logic to concatenate them.

Whenever the specified measures exceed the threshold, the incident triggers, and its actions, if any, execute.

Conditions table

Click Add to select a measure for the condition.
Click Edit to change the configurable properties of a selected measure.
Click Remove to remove a selected measure for a condition from the list.

Each measure used for a condition displays by name in the conditions list and includes the following details:

  • Agent Group or Monitor: The Agent Group or the monitor for which the measure was configured.
  • Threshold: The threshold type for a condition that triggers an incident. Types include no threshold, warning or severe, and severe.
    • no threshold: Condition is ignored when the incident rule is evaluated.
    • warning or severe: An incident triggers when the warning threshold is exceeded. No additional incident occurs if the severe threshold is exceeded afterwards.
      If you want a separate incident to be raised when a severe condition occurs after a warning condition, define a separate incident rule that uses the severe threshold.
    • severe: An incident triggers when the severe threshold is exceeded.
  • Aggregation: The aggregation method for measure values for the evaluation timeframe: avg (average), count, last, max (maximum), min (minimum), sum, or first.
    A measure can also occur multiple times per PurePath.
  • Logic: The logical operator to combine multiple conditions: and, or. You can select an operator only if two or more measures are listed.

    Multiple conditions can be concatenated logically. When the condition is evaluated, no operator precedence (AND stronger than OR) is applied:

    • If the first FALSE condition is followed by an AND concatenation, then the complete expression evaluates to FALSE.
    • If the first FALSE condition occurs after an AND concatenation, then the complete expression evaluates to FALSE.
    • If the first TRUE condition is followed by an OR concatenation, then the complete expression evaluates to TRUE.

    For example:

    • true AND false OR true — evaluates to false
    • true OR false AND true — evaluates to true
    • true OR false AND false — evaluates to true
    • false AND true — evaluates to false
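The three rules above can be read as a left-to-right scan that stops at the first condition for which a rule fires. The following is a minimal sketch of that reading, not AppMon's actual code: the function name `evaluate_conditions` is hypothetical, and returning the last condition's value when no rule fires is an assumption, since the text does not cover that case.

```python
def evaluate_conditions(values, operators):
    """Evaluate conditions left to right without operator precedence.

    values:    list of booleans, one per condition
    operators: list of "and" / "or", one between each pair of conditions
    """
    if len(operators) != len(values) - 1:
        raise ValueError("need exactly one operator between each pair of conditions")

    for i, value in enumerate(values):
        prev_op = operators[i - 1] if i > 0 else None
        next_op = operators[i] if i < len(operators) else None
        # A FALSE condition next to an AND makes the whole expression FALSE.
        if not value and "and" in (prev_op, next_op):
            return False
        # A TRUE condition followed by an OR makes the whole expression TRUE.
        if value and next_op == "or":
            return True

    # Assumption: if no rule fired, the last condition decides the result.
    return values[-1]


# The four examples from the list above.
print(evaluate_conditions([True, False, True], ["and", "or"]))   # False
print(evaluate_conditions([True, False, True], ["or", "and"]))   # True
print(evaluate_conditions([True, False, False], ["or", "and"]))  # True
print(evaluate_conditions([False, True], ["and"]))               # False
```

Note that the third example would evaluate to false under strict left-to-right grouping, `(true OR false) AND false`, which is why the order of conditions in the table matters.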

If you change the thresholds of a measure, the change affects all incident rules that use this measure. If you need the same measure with different thresholds, create a copy of the measure and set the new thresholds on the copy.

Actions

You can add automatic actions to be performed when the incident occurs.

The Actions tab has two modes: basic, where you can only configure email notification, and advanced, where you can configure any action. Click Advanced Configuration or Basic Configuration to switch between them.

A message displays if the general AppMon email configuration is missing. If this happens, you can configure email by clicking Yes in the message box. The Email tab of the Services item of the Dynatrace Server Settings dialog box opens.

Basic configuration

Actions - Basic configuration

For basic configuration, you can only configure email notification for incidents. The Default From user of the Email settings is the sender. You can change it in the Advanced Settings mode.

  • Send notification upon Incident Rule violation: Select to activate email notifications.
  • Email: Type in the email recipients. You can enter actual email addresses, or AppMon user names and user groups. For user names and groups, AppMon sends the notification to the email address stored in the user account.
    The list of recipients appears in the Email Recipients field.
  • Smart Alerting: Sends email just once if multiple incidents occur, until the raised incident is confirmed by a user. Otherwise notification goes out for each incident.
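The Smart Alerting behavior described above can be pictured as a small piece of state that blocks further emails until a user confirms the raised incident. This is a hedged sketch of that description only: the `SmartAlerting` class and its method names are hypothetical and do not correspond to any AppMon API.

```python
class SmartAlerting:
    """Hypothetical model of the Smart Alerting option for one incident rule."""

    def __init__(self, enabled):
        self.enabled = enabled
        self.awaiting_confirmation = False

    def on_incident(self):
        """Return True if an email should be sent for this incident."""
        if self.enabled and self.awaiting_confirmation:
            # An earlier incident is still unconfirmed, so no further email.
            return False
        self.awaiting_confirmation = True
        return True

    def confirm(self):
        """A user confirms the raised incident; emails are allowed again."""
        self.awaiting_confirmation = False


smart = SmartAlerting(enabled=True)
sent = [smart.on_incident(), smart.on_incident()]  # second incident is silent
smart.confirm()
sent.append(smart.on_incident())                   # email is sent again
print(sent)
```

With `enabled=False`, `on_incident()` returns True every time, which matches the statement that a notification otherwise goes out for each incident.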

Linked dashboard

You can link a dashboard to the incident rule. This dashboard is used when the incident is reported by email. You can also quickly navigate to the dashboard from the Incidents dashlet, via the context menu of the incident. You can only select dashboards stored on the server where the System Profile is stored. The default for new incident rules is the Incident Dashboard, which is deployed with every AppMon Server.

Linked dashboard

For built-in baseline incident rules, the linked dashboard configuration includes an option to open the affected splitting in the Applications dashboard.

For built-in host incidents, the configuration includes an option to open the Infrastructure monitoring dashboard for the affected host.

Rule creation example