Create a highly available AppMon installation

The AppMon Server has built-in capabilities to detect problems with the server process. When the watchdog detects such a problem, it restarts the server process automatically. You can extend this protection to cover hardware faults by using external tools.

The setup described here is the recommended and tested way to achieve high availability for the AppMon Server.

Note

High availability of the configured session storage and Performance Warehouse is outside the scope of this guide. It is up to the user to make sure the database is highly available.

Goal

The goal of this setup is automatic failover if the host where the AppMon Server is running goes down for any reason. Specifically, we want:

  • Automatic recovery
  • Downtime shorter than the time it takes to repair the hardware problem.

Downtime

Note that this setup does not pursue zero downtime (that is, zero lost transactions) during failover. The second AppMon Server serves as a cold standby, so there is some downtime until it can take over: about 60 seconds to detect that the active Server is no longer available, plus the startup time of the backup AppMon Server.

Prerequisites

This setup requires:

  • 2 Linux hosts
  • 3 static IP addresses. However, to avoid split brain, it's better to have 5 (see the Overview below).
  • 1 Network File System (NFS) share on a Storage Area Network (SAN)

Overview

The image below shows the scheme of the setup.

Highly available installation overview

The general idea is to use two hosts that are both capable of starting a shared AppMon installation that lives on a SAN. If one host goes down, the other host takes over and starts the AppMon Server. Traffic from Collectors is then redirected to the new host.

To achieve this, we install heartbeat, a Linux daemon, on both hosts. It monitors the availability of the hosts and, when necessary, elects one host as the master. The master host obtains the cluster IP, to which all data is sent. The master host then mounts the NFS share with the AppMon installation and starts the AppMon Server from there.

To avoid split brain situations, where both hosts consider themselves the master because the daemons cannot reach each other, we highly recommend using hosts with two NICs and connecting one of those NICs directly to the other host. This allows heartbeat to detect the availability of the other host even in the presence of network problems.

Procedure

This procedure uses names and IPs from the image above. Replace them accordingly in your installation.

1. Prepare environment

  1. Get two Linux hosts. We will refer to them as host1 and host2 further in this guide.
  2. Assign two static IPs for the main network:
    • 192.168.1.11 to host1.
    • 192.168.1.12 to host2.
  3. Optional: Assign two static IPs for the secondary connection that prevents split brain. Do not use a virtual network; it does not work with heartbeat.
    • 192.168.2.11 to host1.
    • 192.168.2.12 to host2.
  4. Reserve the static IP for the cluster: 192.168.1.10. This IP is always assigned to the master host.
  5. Create a DNS entry for the cluster IP address (192.168.1.10). Use this DNS name as the server hostname when connecting Collectors or AppMon Clients (see the example after this list).
  6. Make sure that an NFS share on a SAN is available. We refer to it as STORAGE_URL further in this guide.
  7. Get two AppMon licenses. You need a separate license for each host. You can get an additional license from Dynatrace support for use in a high availability setup.
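
If a real DNS record is not yet available, an entry in /etc/hosts on the connecting machines can stand in while you test. The name appmon-cluster.example.com is a hypothetical placeholder:

# /etc/hosts on machines that connect to the cluster
192.168.1.10    appmon-cluster.example.com    appmon-cluster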

2. Install the AppMon Server on the SAN

Here you can use one of the hosts and temporarily mount STORAGE_URL on it. We refer to this mount location as APPMON_DIR. The actual directory can be, for example, /srv/appmon.
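
For example, a temporary mount could look like this. STORAGE_URL is the placeholder for your actual NFS export; mount options depend on your storage:

sudo mkdir -p /srv/appmon
sudo mount -t nfs STORAGE_URL /srv/appmon   # e.g. filer.example.com:/export/appmon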

  1. Install the AppMon Server to the APPMON_DIR. See Install the AppMon Server to learn how to do it.
  2. Add a license for each host:
    1. Copy the two license files to the APPMON_DIR/server/config directory on the NFS share.
    2. Rename them to dtlicense.lic.host1 and dtlicense.lic.host2, for example as in the sketch below.
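
Assuming the license files were downloaded to your home directory (the source file names are hypothetical):

cd /srv/appmon/server/config
sudo cp ~/license-a.lic dtlicense.lic.host1
sudo cp ~/license-b.lic dtlicense.lic.host2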

3. Install heartbeat

Install it on both host1 and host2. You can use the following command:

sudo apt-get install heartbeat

The heartbeat configuration directory, for example /etc/ha.d, is referred to as HEARTBEAT_INSTALL further in this guide.

Important

In the following steps 4 to 7 we configure heartbeat on both hosts. All configuration files have to be identical on both hosts! So simply configure the files on one host, and then copy them to the second host.

4. Copy the configuration file

The ha.cf file contains the cluster configuration: how to reach the other cluster nodes, ping frequencies, and so on. A sketch of such a file is shown after the steps below.

  1. Download the ha.cf file, and place it in the HEARTBEAT_INSTALL directory.
  2. In the file, adapt host names and IP addresses. Do not modify any other settings.
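
The following is an illustrative sketch only; download and adapt the provided ha.cf rather than writing one from scratch. The NIC names and the 60-second deadtime are assumptions matching the setup above:

# HEARTBEAT_INSTALL/ha.cf (sketch)
logfacility local0
keepalive 2                 # heartbeat interval in seconds
deadtime 60                 # declare a host dead after 60 s (the downtime estimate above)
bcast eth1                  # heartbeat over the direct NIC-to-NIC link
ucast eth0 192.168.1.11     # heartbeat ignores its own address,
ucast eth0 192.168.1.12     # so the file can stay identical on both hosts
auto_failback off           # don't fail back automatically when the old master returns
node host1
node host2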

5. Copy the haresources file

The haresources file specifies what to do when the master role changes. It calls scripts from the resource.d directory with the appropriate arguments. In this setup it does the following (a sketch of the file is shown after these steps):

  • File system: Mounts NFS share to the APPMON_DIR.
  • IPaddr2: Applies the cluster IP address to an Ethernet device.
  • MailTo: Sends an email when the host is elected to be the master and when a host is elected to be a slave.
  • appmon: Calls the script that runs AppMon from the NFS share. You will configure this script in the next step.
  1. Download the haresources file, and place it in the HEARTBEAT_INSTALL directory.
  2. In the file, adapt the hostname (use the local hostname), file system mount, cluster IP address, and mail configuration.
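
In haresources, all resources of the cluster live on a single logical line: the preferred node, then the resource scripts with their parameters separated by "::". A sketch under the assumptions used so far (the mail address is hypothetical):

# HEARTBEAT_INSTALL/haresources (sketch; "\" continues the single logical line)
host1 Filesystem::STORAGE_URL::/srv/appmon::nfs \
      IPaddr2::192.168.1.10/24/eth0 \
      MailTo::admin@example.com::AppMonFailover \
      appmon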

6. Copy the AppMon startup script

The script starts and stops the AppMon instance when the master host changes; heartbeat calls it with a start or stop argument, like an init script. A sketch is shown after the steps below.

  1. Download the appmon file, and place it in the HEARTBEAT_INSTALL directory.
  2. In the file, adapt the location of the AppMon installation.
  3. Make sure the script has execute permissions.
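
A minimal sketch of such a script. The license-copy step and the dynaTraceServer path are our assumptions about how the downloadable script works, chosen to match the per-host license files from step 2; adapt them to your installation:

#!/bin/sh
# HEARTBEAT_INSTALL/appmon -- heartbeat resource script (sketch)
APPMON_DIR=/srv/appmon   # the NFS mount point from the haresources file

case "$1" in
  start)
    # put the host-specific license in place before starting (see step 2)
    cp "$APPMON_DIR/server/config/dtlicense.lic.$(hostname)" \
       "$APPMON_DIR/server/config/dtlicense.lic"
    "$APPMON_DIR/init.d/dynaTraceServer" start
    ;;
  stop|status)
    "$APPMON_DIR/init.d/dynaTraceServer" "$1"
    ;;
  *)
    echo "Usage: $0 {start|stop|status}" >&2
    exit 1
    ;;
esac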

7. Create the authentication key

The authentication key file allows cluster members to join the cluster. The key can be any random string; the steps below generate one.

  1. Create a new random key:
    dd if='/dev/urandom' bs=512 count=1 2>'/dev/null' | openssl sha1 | cut --delimiter=' ' --fields=2
    
  2. Create the /etc/ha.d/authkeys file. Copy the following to the file:
    auth 1
    1 sha1 <key>
    
  3. Replace <key> with the output of step 1.
  4. Set up permissions:
    sudo chmod 600 /etc/ha.d/authkeys
    
Important

Make sure to copy the ha.cf, haresources, appmon, and authkeys files to the appropriate locations on the second host!
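
For example, from host1 (assuming HEARTBEAT_INSTALL is /etc/ha.d on both hosts):

scp /etc/ha.d/ha.cf /etc/ha.d/haresources /etc/ha.d/appmon root@host2:/etc/ha.d/
scp /etc/ha.d/authkeys root@host2:/etc/ha.d/
ssh root@host2 'chmod 600 /etc/ha.d/authkeys && chmod +x /etc/ha.d/appmon'

Afterwards, restart heartbeat on both hosts so the new configuration is picked up (the service name can vary by distribution):

sudo service heartbeat restart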

8. Set up the license on host1

Execute the following command:

sudo /usr/share/heartbeat/hb_takeover

It executes the resources specified in the haresources file: you get the mount point, the cluster IP is assigned, and the mail notifications are sent. The attempt to execute the appmon script will, however, fail, because there is no activated license for this host yet.

  1. Run the AppMon Client, connect to the cluster host name, and activate the first license.
  2. Copy the activated license from the APPMON_DIR/server/config/dtlicense.lic file to the APPMON_DIR/server/config/dtlicense.lic.host1 file on the NFS share.
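
For example (paths assume the /srv/appmon mount point):

cd /srv/appmon/server/config
sudo cp dtlicense.lic dtlicense.lic.host1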

9. Configure the external host

In the AppMon Client, click Settings > Dynatrace Server > Services > Management. In the External host name field, set the DNS entry of the cluster IP.

This guarantees that all services of this server are reachable from other hosts by using this host name.

10. Set up the license on host2

  1. Make host2 the master by executing the following command on host2:
    sudo /usr/share/heartbeat/hb_takeover
    
  2. Wait for the AppMon Client to reconnect.
  3. Activate the second license.
  4. Copy the activated license from the APPMON_DIR/server/config/dtlicense.lic file to the APPMON_DIR/server/config/dtlicense.lic.host2 file on the NFS share.

Done

You are all set now. host2 is currently the master host. If it goes down for any reason, host1 will take over.