The Hostgroups tab displays the hostgroups and the patterns that define their members.
Specify the hostgroup properties in the Create Hostgroup or Edit Hostgroup dialog box.
Hostgroups properties — General
The General tab contains basic hostgroup properties:
- Automatically remove offline Hosts after XX hours: After the specified period, an offline host is removed from the Infrastructure monitoring dashboard. When the host comes online again, it rejoins the group.
Hostgroups properties — Hosts
The Hosts tab contains the host pattern, and the list of the hosts in the group.
To add a host by pattern, type one of the string comparison operators in the Add Hosts by Pattern text box. Matching operators appear in the box as you type, so you can select from the list of matches. Patterns are simple logical expressions that can include the following:
- The string comparison operators: equals, startswith, contains, and endswith.
- The logical operators: not, and, or (in order of precedence). Logical operators can be grouped together with parentheses to change order of precedence. For example:
(contains 'at' or contains 'de') and endswith 'corp'.
- A string in single quotes. Escape any single quote inside the string with a backslash character.
All comparison operators are case-insensitive and take exactly one string literal as a parameter. For example:
not contains 'internal'
contains 'at' or contains 'de' and endswith 'corp'
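The pattern rules above can be sketched as a small evaluator. This is only an illustration of the described grammar, not AppMon's actual implementation: the function names `tokenize` and `matches` are hypothetical, and the precedence (not, then and, then or) follows the order stated in the list above.

```python
import re

# One token: a parenthesis, a single-quoted string (backslash escapes allowed), or a bare word.
TOKEN = re.compile(r"\s*(?:(\(|\))|'((?:\\.|[^'\\])*)'|(\w+))")

# The four string comparison operators; host and argument are lowercased before the call.
COMPARATORS = {
    "equals": lambda host, s: host == s,
    "startswith": lambda host, s: host.startswith(s),
    "contains": lambda host, s: s in host,
    "endswith": lambda host, s: host.endswith(s),
}

def tokenize(pattern):
    tokens, pos = [], 0
    pattern = pattern.rstrip()
    while pos < len(pattern):
        m = TOKEN.match(pattern, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        paren, string, word = m.groups()
        if paren:
            tokens.append(("paren", paren))
        elif string is not None:
            # Remove the backslash in front of escaped characters such as \'
            tokens.append(("str", re.sub(r"\\(.)", r"\1", string)))
        else:
            tokens.append(("word", word.lower()))
        pos = m.end()
    return tokens

def matches(pattern, host):
    """Return True if the host name satisfies the pattern expression."""
    tokens, pos = tokenize(pattern), 0
    host = host.lower()

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():            # lowest precedence
        value = parse_and()
        while peek() == ("word", "or"):
            take()
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == ("word", "and"):
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():           # highest precedence
        if peek() == ("word", "not"):
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        kind, text = take()
        if (kind, text) == ("paren", "("):
            value = parse_or()
            if take() != ("paren", ")"):
                raise ValueError("missing closing parenthesis")
            return value
        if kind == "word" and text in COMPARATORS:
            arg_kind, arg = take()
            if arg_kind != "str":
                raise ValueError(f"{text} requires a quoted string")
            return COMPARATORS[text](host, arg.lower())
        raise ValueError(f"unexpected token {text!r}")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result
```

In this sketch, a host named at-web.example matches the unparenthesized pattern, because and binds tighter than or, but it does not match the parenthesized variant, which also requires the name to end with 'corp'.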
To manually add hosts to the hostgroup, click Add. The pattern text is updated with
or equals '<hostaddress>'.
A host must belong to exactly one hostgroup. However, a host may match the patterns of several hostgroups. To guarantee a unique match, adjust the group priority: right-click the group list and select Higher Priority or Lower Priority from the context menu.
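One way to picture the effect of group priority is a first-match-wins lookup over an ordered list. The sketch below assumes that a higher-priority group is checked before a lower-priority one; the function `assign_hostgroup`, the group names, and the predicates are hypothetical and stand in for real hostgroup patterns.

```python
def assign_hostgroup(host, groups):
    """Return the name of the first group whose predicate matches the host.

    groups is a list of (name, predicate) pairs, ordered from highest
    to lowest priority (an assumption made for this sketch).
    """
    for name, predicate in groups:
        if predicate(host):
            return name
    return None

# Both groups match web01.corp; the group listed first wins.
groups = [
    ("corp-web", lambda h: h.startswith("web") and h.endswith("corp")),
    ("all-corp", lambda h: h.endswith("corp")),
]
```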
Hostgroups properties — Thresholds
The Thresholds tab contains thresholds, which determine health criteria for the hosts in the group. All the hosts from the group use these thresholds by default. You can, however, specify host-specific thresholds in the host properties.
If any of the threshold groups (CPU, memory, network, or disk) is violated, the host is considered unhealthy and is shown as unhealthy on the Infrastructure monitoring dashboard in the AppMon Client or in the Hosts tile of the Host health web view in AppMon Web. More specifically:
The server evaluates host health every minute. The CPU, memory, and network measurements of the last 15 minutes are split into one-minute chunks, each aggregated by average.
If 13 chunks violate the relevant criteria, the host is considered unhealthy. The health is deemed good again when at least three of the chunks are healthy. The healthy chunks don't have to be consecutive.
Disk health status changes immediately, without a watch period, when a threshold is crossed, and the displayed host health changes accordingly.
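The 15-minute rule for CPU, memory, and network can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: the names `minute_averages` and `is_unhealthy` are hypothetical, samples are assumed to be (timestamp in seconds, value) pairs, and "13 chunks" is read as "at least 13 of the 15 chunks".

```python
from statistics import mean

WINDOW_MINUTES = 15    # length of the watch period
UNHEALTHY_CHUNKS = 13  # violated chunks needed to mark the host unhealthy

def minute_averages(samples, now, window=WINDOW_MINUTES):
    """Average the samples of the last `window` minutes per one-minute chunk."""
    start = now - window * 60
    buckets = [[] for _ in range(window)]
    for timestamp, value in samples:
        if start <= timestamp < now:
            buckets[int((timestamp - start) // 60)].append(value)
    # Chunks without any sample are skipped in this sketch.
    return [mean(bucket) for bucket in buckets if bucket]

def is_unhealthy(samples, now, violates):
    """True if at least 13 one-minute averages violate the threshold.

    `violates` is a caller-supplied predicate, for example
    lambda cpu_percent: cpu_percent > 90.
    """
    violated = sum(1 for avg in minute_averages(samples, now) if violates(avg))
    return violated >= UNHEALTHY_CHUNKS
```

With 15 chunks in the window, "at least 13 violated" and "at least three healthy" are complementary, which is why a single count is enough here.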
CPU: A violation of any of the values degrades CPU health according to the rule described above.
Memory: A violation of the Minimum available subgroup or of maximum page faults per second degrades memory health according to the rule described above.
Both minimum available MB and minimum available % of the Minimum Available subgroup must be violated for the subgroup to contribute. The resulting logical formula is:
('minimum available MB' and 'minimum available %') or 'maximum page faults per second'.
Disk group and subgroup: Both minimum free MB and minimum free % must be violated to trigger, and they trigger immediately.
To ignore one of the two values combined by the and operator, set its threshold so that its comparison always resolves to true.
For example, you can set a very high size or percentage.
Memory available < 100% always returns true, so a violation occurs when
min. available MB < threshold becomes true.
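The memory and disk formulas, and the trick of neutralizing one side of an and, can be sketched as below. The function names and dictionary keys are hypothetical, and the direction of the page fault comparison (greater than the maximum) is an assumption.

```python
def memory_violated(available_mb, available_pct, page_faults_per_s, limits):
    """('minimum available MB' and 'minimum available %') or 'maximum page faults per second'."""
    minimum_available = (available_mb < limits["min_available_mb"]
                         and available_pct < limits["min_available_pct"])
    return minimum_available or page_faults_per_s > limits["max_page_faults_per_s"]

def disk_violated(free_mb, free_pct, limits):
    """Both minimum free MB and minimum free % must be violated."""
    return free_mb < limits["min_free_mb"] and free_pct < limits["min_free_pct"]

# Setting the percentage threshold to 100 makes "available % < 100" true in
# practice, so only the MB threshold decides the Minimum available subgroup.
limits = {
    "min_available_mb": 512,
    "min_available_pct": 100,
    "max_page_faults_per_s": 1000,
}
```

With these example limits, a host with 400 MB available is in violation regardless of its percentage, while a host with 800 MB available is only in violation if its page fault rate exceeds the maximum.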
Click Configure Exclusions to exclude specific disks, mount points, or NICs from monitoring. Explicitly include what would otherwise be excluded. See Infrastructure — Exclusions for more information.
Compare measures between operating systems
Measures are relatively comparable between different operating systems, with the following limitations:
- Windows does not report a CPU load value as *NIX systems do. CPU load is therefore omitted from health calculations on Windows hosts.
- Only hard page faults are considered as page faults. Windows systems have hard page faults even with free memory.
- On AIX versions earlier than 5.2, the reported page faults include both soft and hard page faults.
- Disk space available is determined from the point of view of the Agent. If the Agent has restrictions such as disk space quota, only the disk space available to the Agent is reported as free space.
- Page faults require disk access, which includes user rights to memory-mapped files. Applications such as backup software use memory-mapped files heavily and might temporarily cause high page fault rates.
See What is Monitored on What Platform? for a detailed comparison of operating systems.
Hostgroups properties — Availability
The Availability tab defines the list of processes monitored on this hostgroup.
- Host Availability: Defines the Minimum and Maximum number of hosts in the hostgroup.
- Process Availability: Defines the processes to be monitored by the Host Monitoring or a technology-specific Agent. Running and Unavailable processes are displayed in the infrastructure overview. When a process becomes unavailable, an incident is created. Use pattern matching to add a process to the list.
To set a process pattern:
- Click the + button to the right of the process name list.
- Enter a pattern match string in the Pattern column. The process name to match with the pattern includes the process path. The pattern matching supports a * wildcard at the beginning and end. For example,
*w3wp.exe -ap ".NET 4.5" specifies that the host monitors all processes matching the pattern.
- In the Display Name column, type a descriptive name for the process pattern.
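The wildcard rule for process patterns can be sketched as follows. This is an illustration only: the function `process_matches` is hypothetical, the comparison is assumed to be case-sensitive, and a * is treated as a wildcard only at the very beginning or end of the pattern, as described above.

```python
def process_matches(pattern, process):
    """Match a process string (path plus arguments) against a pattern.

    A leading * allows any prefix, a trailing * allows any suffix;
    without wildcards the whole string must be equal.
    """
    leading = pattern.startswith("*")
    if leading:
        pattern = pattern[1:]
    trailing = pattern.endswith("*")
    if trailing:
        pattern = pattern[:-1]
    if leading and trailing:
        return pattern in process
    if leading:
        return process.endswith(pattern)
    if trailing:
        return process.startswith(pattern)
    return process == pattern
```

Because the process name includes the process path, a pattern without a leading * must spell out the full path to match.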