Infrastructure

Overview

In the AppMon Client, choose Settings > Dynatrace Server > Infrastructure to configure and manage sites, hostgroups, hosts, and labels applicable to hosts and alert configurations and their exceptions.

Hosts tab

In the AppMon Client, choose Settings > Dynatrace Server > Infrastructure > Hosts to see all connected Agents, Collectors, and AppMon Server hosts. Each entry represents a single host on the network. A host belongs to exactly one detected or selected site, and to exactly one hostgroup (the one whose defined pattern matches the host address). See Troubleshooting > Hostgroup Priority for details on guaranteed pattern uniqueness.

If no Agent is connected to the backend server process, the host is considered unavailable. A host can have any number of labels for flexible categorization.

Click Create to manually create a new host using the Create / Edit Host dialog box. To change the settings of an existing host, double-click the host in the list, or select the host and click Edit.

Create / Edit Host dialog box

Auto-detection typically creates all needed hosts. Create a host entry manually only in very special cases, such as:

  • There is no Agent active on the host.
  • You have the ping plug-in installed and want to define a host incident rule to get an alert when the host can’t be pinged.

Edit a host entry to:

  • Update the name that was automatically derived from the address.
  • Change the host’s site.
  • Add a label.
  • Configure specific thresholds and availability for this host.
  • Manually create a host when needed.

The host uses the assigned hostgroup thresholds by default.
Clear the Use Thresholds of the Selected Hostgroup check box in the Thresholds Tab to use host-specific thresholds.

Tip

If using host-specific thresholds, consider displaying the Use Own Thresholds column on the Hosts tab. To do this, right-click on the column headers and select it from the list.

See Hostgroup Dialog Box - Thresholds Tab and Hostgroup Dialog Box - Availability Tab for more information on the Thresholds and Availability tabs.

Hostgroups tab

Choose Settings > Dynatrace Server > Infrastructure > Hostgroups to see the custom hostgroups and the patterns that define their members. Use this tab to access the Add / Edit Hostgroup dialog box to create, edit, and configure hostgroups.

Add/Edit Hostgroup dialog box

The Add/Edit Hostgroup dialog box includes tabs that let you:

  • Set the pattern to match hosts into a group by comparing their address to the pattern.
  • Configure group-specific thresholds.
  • Set the number of hours a host can be offline before it is removed from the infrastructure overview.
  • Define host availability and processes monitored by the host agent.

Tip

Click the Add button to add hosts manually to the hostgroup. The pattern text is updated with equals 'hostaddress'.

Hostgroup Dialog Box - Hosts Tab - Pattern Definition

To add a host by pattern, type one of the string comparison operators in the Add Hosts by Pattern text box. Matching operators appear in the box as you type, so you can select from the list of matches. Patterns are simple logical expressions that can include the following:

  • One of the string comparison operators: equals, startswith, contains, and endswith.
  • A string in single quotes. Escape any single quote inside the string with a backslash character, for example 'isn\'t'.
  • The logical operators not, and, or (in order of precedence). Use parentheses to group expressions and change the order of precedence. For example: (contains 'at' or contains 'de') and endswith 'corp'.

All functions are case-insensitive and require one string literal as parameter. For example:

  • equals '34325.clients.emea.dynatrace.corp'
  • startswith 'lnz'
  • contains 'emea'
  • endswith '.com'
  • not contains 'internal'
  • (contains 'at' or contains 'de') and endswith 'corp'
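The pattern rules above can be illustrated with a small evaluator. This is only a sketch of the described grammar, not AppMon's implementation: the function names (tokenize, matches) are hypothetical, and treating the operator keywords themselves as case-insensitive is an assumption.

```python
import re

# One token: a parenthesis, a single-quoted string (backslash escapes allowed), or a word.
TOKEN_RE = re.compile(r"\s*(\(|\)|'(?:\\.|[^'\\])*'|[A-Za-z]+)")

COMPARATORS = {
    "equals": lambda address, text: address == text,
    "startswith": lambda address, text: address.startswith(text),
    "contains": lambda address, text: text in address,
    "endswith": lambda address, text: address.endswith(text),
}


def tokenize(pattern):
    """Split a pattern into parentheses, quoted strings, and keywords."""
    pattern = pattern.rstrip()
    tokens, pos = [], 0
    while pos < len(pattern):
        match = TOKEN_RE.match(pattern, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def matches(pattern, address):
    """Return True if the host address satisfies the pattern (hypothetical helper)."""
    tokens = tokenize(pattern)
    address = address.lower()  # comparisons ignore case
    pos = 0

    def keyword():
        # Current token in lowercase, or None at the end of the input.
        return tokens[pos].lower() if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():  # lowest precedence
        result = parse_and()
        while keyword() == "or":
            take()
            right = parse_and()  # always parse the right side to consume its tokens
            result = result or right
        return result

    def parse_and():
        result = parse_not()
        while keyword() == "and":
            take()
            right = parse_not()
            result = result and right
        return result

    def parse_not():  # highest precedence of the logical operators
        if keyword() == "not":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        if keyword() == "(":
            take()
            result = parse_or()
            if keyword() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return result
        name = keyword()
        if name not in COMPARATORS:
            raise ValueError(f"expected a comparison operator, got {name!r}")
        take()
        if pos >= len(tokens) or not tokens[pos].startswith("'"):
            raise ValueError("expected a string in single quotes")
        literal = take()
        # Drop the surrounding quotes and resolve backslash escapes such as \'.
        text = re.sub(r"\\(.)", r"\1", literal[1:-1]).lower()
        return COMPARATORS[name](address, text)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result
```

For the address gateway.example.com, the pattern contains 'at' or contains 'de' and endswith 'corp' matches because and binds more tightly than or, whereas (contains 'at' or contains 'de') and endswith 'corp' does not match.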

Hostgroup Dialog Box - Thresholds Tab

Set thresholds on this tab to determine exactly when the hosts in the selected hostgroup show as unhealthy in the Infrastructure Overview dashboard or in the Hosts tile of the Operations dashboard in AppMon Web.

Generally, if any of the four groups (CPU, memory, network, or disk) is unhealthy, the whole host is considered unhealthy. More specifically:

  • The server evaluates the host health every minute. The CPU, memory, and network measurements of the last 15 minutes are split into one-minute chunks, each aggregated by average.
    If 13 of these chunks violate the relevant criteria, the host is considered unhealthy. The health is deemed good again when at least three of the chunks are healthy. The healthy chunks don’t have to be consecutive.
  • Disk health status changes immediately, without a watch period, when a threshold is crossed, and the host health display changes accordingly.
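The evaluation window described above can be sketched as follows. The function and constant names are hypothetical, and reading "13 chunks" as "at least 13 of the 15 chunks" is an assumption; it is consistent with the statement that three healthy chunks are enough to restore good health.

```python
WINDOW_MINUTES = 15            # measurements of the last 15 minutes are considered
UNHEALTHY_MIN_VIOLATIONS = 13  # assumption: "13 chunks" means "at least 13"


def chunk_violates(samples, threshold):
    """Aggregate one minute of samples by average and compare to a maximum threshold."""
    return sum(samples) / len(samples) > threshold


def host_is_unhealthy(chunk_violations):
    """chunk_violations holds one boolean per one-minute chunk, oldest first.

    Only the most recent 15 chunks count. Healthy chunks do not have to be
    consecutive, so counting the violating chunks is sufficient.
    """
    recent = list(chunk_violations)[-WINDOW_MINUTES:]
    return sum(recent) >= UNHEALTHY_MIN_VIOLATIONS
```

With 13 violating chunks and 2 healthy ones the host counts as unhealthy. As soon as a third healthy chunk enters the window, at most 12 chunks can still violate and the host counts as healthy again.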

Threshold groups and subgroups

  • CPU group: Any of the three values (usage %, system %, or load) makes the CPU unhealthy according to the formula described above.
  • Memory group: The Minimum Available subgroup or the maximum page faults per second makes memory unhealthy according to the formula described above.
    • Both minimum available MB and minimum available % of the Minimum Available subgroup must be violated for the subgroup to contribute.
    • In all: (minimum available MB and minimum available %) or maximum page faults per second.
    • To ignore one of the two ANDed values, set its threshold so that it is always violated, for example a very high size or percentage. Memory available < 100% returns true, so a violation then occurs as soon as min. available MB < threshold becomes true.
  • Disk group and subgroup: Both minimum free MB and minimum free % must be violated to trigger, but they trigger immediately. To ignore one of the two ANDed values, set its threshold so that it is always violated.
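The way the memory thresholds combine can be written as a short boolean expression. This is a sketch with hypothetical parameter names; it covers a single one-minute chunk only, and the 15-minute window rule still applies on top of it.

```python
def memory_chunk_violates(available_mb, available_pct, page_faults_per_s,
                          *, min_mb, min_pct, max_page_faults):
    """Return True if one memory chunk violates the configured thresholds.

    Logic: (minimum available MB AND minimum available %) OR maximum page faults.
    """
    low_memory = available_mb < min_mb and available_pct < min_pct
    return low_memory or page_faults_per_s > max_page_faults


# Ignoring the percentage: with min_pct=100, "available % < 100" is practically
# always true, so only the MB threshold decides the Minimum Available subgroup.
only_mb_matters = memory_chunk_violates(
    available_mb=500, available_pct=40, page_faults_per_s=0,
    min_mb=1024, min_pct=100, max_page_faults=1000,
)
```

The disk group follows the same AND logic for minimum free MB and minimum free %, but without the watch period.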

Click Configure Exclusions to exclude specific disks, mount points, or NICs from monitoring. You can also explicitly include what would otherwise be excluded. See Exclusions Tab for more information.

For the Add/Edit Host dialog box, clear the Use Thresholds of the Selected Hostgroup check box in the Thresholds Tab to use host-specific thresholds.

Tip

The host uses the thresholds of the assigned hostgroup by default. You can override hostgroup-wide thresholds and set host-specific ones in the Thresholds tab.

Compare measures between operating systems

Measures are relatively comparable between different operating systems, with the following limitations:

  • Windows does not deliver a CPU load like *NIX systems. CPU load is omitted in health calculations on Windows hosts.
  • Only hard page faults are considered as page faults. Windows systems have hard page faults even with free memory.
  • Page faults on AIX versions earlier than 5.2 report both soft and hard page faults.
  • Disk space available is determined from the point of view of the Agent. If the Agent has restrictions such as disk space quota, only the disk space available to the Agent is reported as free space.
  • Page faults require disk access, which includes access to memory-mapped files. Applications such as backup software make heavy use of memory-mapped files and might cause temporarily high page fault rates.

See Host Health Monitoring > What is Monitored on What Platform? for a detailed comparison of operating systems.

Hostgroup Dialog Box - Availability Tab

  • Host Availability: Define minimum and maximum thresholds for the number of required hosts in the hostgroup.
  • Process Availability: Specify processes for the Host Monitoring Agent to monitor. Running and unavailable processes are displayed in the infrastructure overview. When a process is no longer available, an incident is created. In the Add/Edit Host dialog box, you can select the check box to use the process patterns of the selected hostgroup, then specify the processes in the Add/Edit Hostgroup dialog box.

You can monitor processes by using pattern matching. Click the button to the right of the process name list, then click the top available line in the Pattern column to enter a pattern match string. The process name that is matched against the pattern includes the process path. Pattern matching supports a * wildcard at the beginning and at the end of the pattern. For example, w3wp.exe -ap ".NET 4.5". This specifies that the host monitors all processes matching the pattern. Click in the Display Name column to add a descriptive name for the process pattern match.
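A minimal sketch of how a process pattern with a leading or trailing * wildcard could be evaluated. The function name is hypothetical, and the exact semantics (including the exact, case-sensitive comparison when no wildcard is present) are assumptions, not documented AppMon behavior.

```python
def process_matches(pattern, process_name):
    """Match a process name (including its path) against a pattern.

    Assumed semantics: a leading * allows any prefix, a trailing * allows
    any suffix, and without wildcards the whole name must be equal.
    """
    core = pattern
    leading = core.startswith("*")
    if leading:
        core = core[1:]
    trailing = core.endswith("*")
    if trailing:
        core = core[:-1]

    if leading and trailing:
        return core in process_name
    if leading:
        return process_name.endswith(core)
    if trailing:
        return process_name.startswith(core)
    return process_name == core
```

Because the matched name includes the process path, a pattern such as *w3wp.exe -ap ".NET 4.5"* (wildcards added here for illustration) would match the worker process regardless of its installation directory and of any further arguments.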

Labels tab

The Labels tab lists default and custom labels that logically group hosts when you want to place the same host into more than one group. This is the opposite of patterns, which should uniquely group hosts into hostgroups and are guaranteed to do so with the help of the Priority property.

Tip

Default labels are indicated with the Dynatrace logo.

Click Create or Edit to open the Add/Edit Label dialog box to create and edit any type of label that fits your purpose. For example, create labels for specific operating systems, or separate labels for development, testing, and production. If you create a Monitor for a label, AppMon automatically executes it for all hosts with that label. You cannot edit the default labels.

Sites tab

The Sites tab displays default and custom site names used to assign hosts to a specific location. A site consists of a name and description.

Click Create or Edit to open the Add/Edit Site dialog box, where you can specify a site for a host. For example, you can configure sites for different departments or cities. Each host has exactly one site. As with labels, you are not able to edit the default site.

Alerts tab

This tab lists the default and custom alerts. After configuring thresholds for the hosts, create alerts to define whom to notify when the thresholds are exceeded and how to act on violations.

Click Create or Edit to open the Add/Edit Alert dialog box. To create a custom alert, provide the following.

  • Name: Give the alert a unique name that describes its purpose.
  • For list box: Select the category (Host, Hostgroup, Label, Site) by which the actions should trigger.
  • System Profile check box (optional): Filter the hosts further by the selected System Profile.
  • CPU, Disk, Memory, Network: Check the threshold groups that should trigger.
  • Execute action if first host changes one of the following health states: Every time one threshold of any affected host is exceeded, the actions are executed. Select check boxes to indicate in which areas the actions are executed when the first host becomes unhealthy.
  • Action: Configuration of the actions for this alert. See Incidents and Alerting for more information about actions.

Exclusions tab

The Exclusions tab lists specific disks and mount points that can be excluded or explicitly included for display on the Infrastructure Overview dashboard. Click Create or Edit to open the Add/Edit Alert dialog box and exclude specific disks, mount points, or NICs from monitoring. You can also explicitly include what would otherwise be excluded. See the dialog box or the screenshot below for predefined, standard *NIX exclusions or inclusions that may fit your needs.

A disk, mount point, or NIC is first matched against inclusions. If it matches, it is not excluded. If it does not match, it is matched against exclusions. If it matches, it is excluded. Anything that matches neither inclusions nor exclusions is included. Specify the following to configure disk exclusions.

  • Name: Give the inclusion, exclusion, or ignore rule a unique, descriptive name.
  • Handling type: How the disk, mount point, or NIC is treated. Select one of the following:
    • Exclude: Disk is not visible in the Infrastructure Overview.
    • Include: Disk is not excluded.
    • Ignore: Disk is visible, but does not trigger any action at the defined threshold.
  • Mount Point: The metrics identifier, for example /tmp on *NIX or c:\ on Windows.
    Show *NIX mount points with cat /proc/mounts.
    The asterisk wildcard is allowed, but only at the end. For example:
    • /h* means all mount points starting with /h.
    • /home/* works, but only if you have mount points in the home directory.
    • /home* to catch the whole /home mount point.
    • /tmp* should exclude /tmp/foo.
    • /tmp/* does not exclude /tmp.
  • Operating System: The host operating system.
  • File System: The host file system. You can add a new file system to the drop down list or group them by using the asterisk wildcard at the end. For example, ext* indicates all EXT versions.
  • Hostgroup, Label: These two can only be selected if the Handling setting is Ignore.
  • System Profile: Filter the hosts further by the selected System Profile.
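The matching order and the trailing-wildcard rule for mount points can be sketched as follows. The function names are hypothetical, the comparison is assumed to be case-sensitive, and the Ignore handling type and the other filters (operating system, file system, System Profile) are left out for brevity.

```python
def mount_matches(pattern, mount_point):
    """Match a mount point against a pattern with an optional trailing *."""
    if pattern.endswith("*"):
        return mount_point.startswith(pattern[:-1])
    return mount_point == pattern


def is_monitored(mount_point, inclusions, exclusions):
    """Apply the documented order: inclusions first, then exclusions."""
    if any(mount_matches(p, mount_point) for p in inclusions):
        return True   # explicitly included, never excluded
    if any(mount_matches(p, mount_point) for p in exclusions):
        return False  # excluded
    return True       # matched neither list, so it is included
```

With exclusions ["/tmp*"], both /tmp and /tmp/foo are excluded. With ["/tmp/*"], only mount points below /tmp are excluded and /tmp itself stays monitored. Adding "/tmp/data" to the inclusions keeps that mount point monitored in either case.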

Configuring NIC exclusions

NIC exclusion settings are generally the same as disk settings. The only NIC-specific setting is the name of the network interface.

Troubleshooting

Hostgroup priority

A host has to belong to one and only one hostgroup. If the defined patterns let a host accidentally match more than one hostgroup, use the hostgroup Priority property to guarantee a unique match. When you create a hostgroup, a hidden Priority property gets assigned. See Hostgroup Dialog Box - Hosts Tab - Pattern Definition for more information.

You can reveal the Priority column by right-clicking the column header on the Hostgroups tab and selecting Priority. Change a hostgroup’s priority by right-clicking its row and selecting Higher or Lower Priority.
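The role of the Priority property can be illustrated with a short sketch. The Hostgroup class, the numeric priority, and the rule that the larger value wins are all assumptions made for illustration; the source only states that priority guarantees a unique match.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hostgroup:
    name: str
    priority: int                   # assumption: the larger value wins
    matches: Callable[[str], bool]  # stands in for the hostgroup pattern


def resolve_hostgroup(address: str, hostgroups: List[Hostgroup]) -> Optional[Hostgroup]:
    """Pick exactly one hostgroup for a host, even if several patterns match."""
    candidates = [group for group in hostgroups if group.matches(address)]
    if not candidates:
        return None
    return max(candidates, key=lambda group: group.priority)


groups = [
    Hostgroup("EMEA", 1, lambda address: "emea" in address),
    Hostgroup("Linz", 2, lambda address: address.startswith("lnz")),
]
chosen = resolve_hostgroup("lnz01.clients.emea.dynatrace.corp", groups)
```

Here the address matches both patterns, and the group with the higher priority (Linz) is chosen.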