Setting up application availability monitoring

What is availability monitoring?

AppMon automatically monitors all JVMs and CLRs that are connected through an Agent. The data appears in the Transaction Flow dashlet and in the Process Health dashboard.

Process Health dashboard

AppMon can trigger an incident when an application or a host is not available. You can find triggered incidents in the Incidents dashlet.

A process that has been unavailable for 72 hours is no longer displayed in the Infrastructure monitoring dashboard. It reappears automatically once it is available again.
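
The 72-hour rule can be pictured as a filter over last-seen timestamps. The following is a minimal sketch of that idea only, not AppMon's implementation. The function name `visible_processes`, the `last_seen` mapping, the process names, and the assumption that the window is measured from the moment a process was last available are all hypothetical.

```python
from datetime import datetime, timedelta

# Retention window stated in the text above.
RETENTION = timedelta(hours=72)


def visible_processes(last_seen, now):
    """Return the names of processes that would still be displayed.

    `last_seen` maps a (hypothetical) process name to the datetime at which
    the process was last available. A process is hidden once it has been
    unavailable for the full retention window.
    """
    return sorted(
        name for name, seen in last_seen.items() if now - seen < RETENTION
    )


now = datetime(2024, 1, 10, 12, 0)
last_seen = {
    "frontend": now - timedelta(hours=1),   # seen recently: still displayed
    "backend": now - timedelta(hours=80),   # unavailable for over 72 h: hidden
}
shown = visible_processes(last_seen, now)
```

Updating the `last_seen` entry of a hidden process to a recent time makes it appear in the result again, which mirrors the automatic reappearance described above.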

You can also use AppMon Synthetic Monitoring to monitor the availability and performance of external websites. See Synthetic Monitoring Integration for more information.

Goal of this tutorial

This tutorial explains how to implement availability monitoring with AppMon. Availability monitoring uses a URL or web transaction monitor to check that your applications and services are reachable and responding. It also detects components that are disconnected.

Scenario

Use the easyTravel demo application to learn how to configure availability incidents. If a process, a host, or an Agent Group is unavailable, AppMon sends an email to the administrator. The following use cases are covered: process availability, host availability, and Agent Group availability.

Note

The email service must be set up before you configure email notifications. Check the Email tab of the Services item in the Dynatrace Server Settings. See Email to learn how to configure it.

Process availability

AppMon triggers an Application Process Unavailable (unexpected) Severe incident if a JVM or CLR fails. The Agent Availability icon then turns red. This incident is pre-configured and exists in every newly created System Profile. You don't have to create it manually.

If a JVM or CLR fails, a notification is sent to the Incident Email Group. To configure the email, go to the System Profile > Incidents tab and edit the Application Process Unavailable (unexpected) incident rule:

  1. Open the System Profile Preferences dialog box and click Incidents.
  2. Double-click the Application Process Unavailable (unexpected) incident rule to edit it.
  3. If needed, click Basic Configuration to switch the mode of the Actions tab. This mode makes it easier to specify the email recipients.
  4. Select the Send notification upon Incident Rule violation checkbox.
  5. In the Email field, type the recipient's email address and click +. Press Ctrl+Space to view a list of suggestions.
    You can also specify a user group as a recipient. All members of the group receive the notification at the email address specified in their accounts.
  6. If needed, configure additional email settings, such as the sender address, copy recipients, and subject.
    To do so, click Advanced Configuration to switch the mode of the Actions tab, then double-click the Email notification action to edit it.
  7. To test the notifications, stop easyTravel or one of its subsystems and check that the email notification arrives.

Host unavailable

The built-in URL Monitor allows you to monitor the availability of the application's URL. You can create an incident based on the HostReachable measure to get an email notification if the URL is unavailable.
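
To make the idea of a reachability measure concrete, here is a minimal sketch of a URL check that yields 1 or 0. It is not how the AppMon URL Monitor is implemented. The function name `host_reachable`, the `timeout` and `opener` parameters, and the choice to count an HTTP error response as "reachable" are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def host_reachable(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return 1 if `url` answers an HTTP request, 0 if it cannot be reached.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout):
            return 1
    except urllib.error.HTTPError:
        # The server answered, even if with an error status, so the host
        # itself is reachable (an assumption of this sketch).
        return 1
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return 0
```

Calling `host_reachable` on a schedule with the application's URL and feeding the result into an alerting rule approximates, in spirit, what the monitor and incident rule described in this section do together.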

1. Create URL Monitor

First, you need to create a URL Monitor. If you already have one, skip this part.

  1. Open the System Profile Preferences dialog box and click Monitors.
  2. Click Create.
  3. In the Monitor Type Chooser dialog, select URL Monitor and click OK.
  4. Give your monitor a meaningful name. In this tutorial, it's Sample URL Monitor. Review the parameters of the check request and change them if needed.
  5. Now add hosts to be monitored. In the Hosts pane, click +.
  6. In the Select Host dialog box, select the required hosts, and click OK.
  7. Save all your changes to create the monitor.

2. Create incident rule

Now you need to create an incident rule based on the HostReachable measure of your URL monitor. If the measure is 0, then the host is unreachable.

  1. Open the System Profile Preferences dialog box and click Incidents.
  2. Click Create Incident Rule.
  3. Give a meaningful name to the incident.
  4. Configure the condition of the incident:
    1. On the Conditions tab, click Add.
    2. Find the HostReachable measure of your URL monitor: System Monitoring > URL Monitor > Sample URL Monitor > URL Monitor > HostReachable. You can also press Ctrl+F and search for the measure.
    3. Double-click the measure to edit it.
    4. Set the Lower Severe threshold to 0.
    5. Save the changes to the measure, and then click Add to create a condition for the incident rule.
    6. Make sure that the Threshold is set as Severe for this condition.
  5. Configure email notification, as described in the Process availability section above.
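
The effect of the Lower Severe threshold in this rule can be sketched as a simple comparison. This is an illustration, not AppMon's evaluation logic. The function name `severity` is hypothetical. The sketch also assumes that a value at or below the threshold counts as a violation (a threshold of 0 has to fire on a value of 0) and that HostReachable reports 1 while the host answers.

```python
from typing import Optional


def severity(value: float, lower_severe: float) -> Optional[str]:
    """Return "severe" if `value` is at or below the lower severe threshold."""
    return "severe" if value <= lower_severe else None


# Assumed readings: 1 while the host answers, 0 once it does not.
readings = [1, 1, 0]
results = [severity(v, lower_severe=0) for v in readings]
```

Only the last reading violates the threshold, so only that reading would raise the incident and trigger the configured email action.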

Agent Group availability

Here you create incidents that trigger when the number of Agents connected to a certain Agent Group is lower than specified. This is possible by using several instances of the Connected Agents measure, each subscribed to one Agent Group, which allows different conditions and actions for each group. For example, email can be sent to different people for frontend and backend incidents. Likewise, the backend being down to one server can be a severe incident, while the frontend being down to two servers is already severe.

Here are the steps for creating the incident rule and its measure for the Customer Web Frontend Agent Group of the easyTravel System Profile.

  1. Open the System Profile Preferences dialog box and click Incidents.
  2. Click Create Incident Rule.
  3. Give a meaningful name to the incident.
  4. Configure the condition of the incident:
    1. On the Conditions tab, click Add.
    2. Click Create Measure.
    3. Find the Connected Agents measure: Server Side Performance > Functional Health > Availability > Connected Agents. You can also press Ctrl+F and search for the measure.
    4. Give a meaningful name to the measure.
    5. Set the Lower Severe threshold to 2.
    6. On the Details tab, restrict the measure to the Customer Web Frontend Agent Group. Select the calculate only for selected agent groups radio button, and then select the group in the list.
    7. Click Add to create the measure and close the dialog box.
    8. Select the newly created measure and then click Add to create a condition for the incident rule.
    9. Make sure that the Threshold is set as Severe for this condition.
  5. Configure email notification for the person responsible for the frontend server, as described in the Process availability section above.
  6. Repeat the same steps for the Business Backend Server Agent Group. The differences are a threshold of 1 and a different notification recipient: the person responsible for the backend server.
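
The result of these steps is one rule per Agent Group, each with its own threshold and recipient. The sketch below models that arrangement. It is not AppMon's data model: the `GroupRule` class, the `notifications` function, and the example.com addresses are hypothetical. It also assumes that a count at or below the threshold is a violation, in line with the example where the frontend being down to two Agents is severe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupRule:
    agent_group: str    # Agent Group the measure instance is restricted to
    lower_severe: int   # Lower Severe threshold for Connected Agents
    recipient: str      # who is notified (placeholder address)


RULES = [
    GroupRule("Customer Web Frontend", 2, "frontend-owner@example.com"),
    GroupRule("Business Backend Server", 1, "backend-owner@example.com"),
]


def notifications(connected):
    """Return (recipient, agent_group) pairs for every violated rule.

    `connected` maps an Agent Group name to its number of connected Agents;
    a group missing from the mapping is treated as having no Agents.
    """
    return [
        (rule.recipient, rule.agent_group)
        for rule in RULES
        if connected.get(rule.agent_group, 0) <= rule.lower_severe
    ]


connected = {"Customer Web Frontend": 2, "Business Backend Server": 3}
alerts = notifications(connected)
```

With these counts only the frontend rule is violated, so only the frontend owner is notified. Keeping one rule per group is what lets the thresholds and recipients differ.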