Middleware monitoring - IBM WebSphere MQ

In the IBM WebSphere MQ section you set parameters specific to monitoring IBM WebSphere MQ-based software services.

General

Global properties for the IBM MQ analyzer include timeouts and limits on reported operations.

For any given NAM Probe, you can change global settings for the IBM WebSphere MQ protocol so that the settings are inherited by all user-defined software services.

To modify global settings for IBM WebSphere MQ monitoring:

  1. Modify global settings for monitoring of IBM WebSphere MQ.

    Operation load time threshold
    The number of seconds after which an operation is considered “slow”. You can set this value with a precision of one ten-thousandth of a second.

    Flow timeout
    Maximum idle time for the operation assembler: the maximum number of seconds the system waits for further activity before the operation is considered complete.

    Flow element limit
    Maximum number of messages for a single operation. Each operation contains a number of messages; this option sets the maximum number of elements that can be observed in a single operation.

  2. Fine-tune the reporting of availability, failures and data mining.
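As an illustration of how the three global settings from step 1 interact, here is a minimal Python sketch. It is not NAM code: the class, field names, and example values are hypothetical, and the rule that an operation is closed when either the flow timeout or the flow element limit is reached is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass
class MqGlobalSettings:
    """Hypothetical container; field names are illustrative, not NAM configuration keys."""
    operation_load_time_threshold_s: float  # seconds, set with 0.0001 s precision
    flow_timeout_s: float                   # maximum idle time, in seconds
    flow_element_limit: int                 # maximum messages per operation


def is_slow(operation_time_s: float, settings: MqGlobalSettings) -> bool:
    """An operation is 'slow' when it takes longer than the load time threshold."""
    return operation_time_s > settings.operation_load_time_threshold_s


def should_close_operation(idle_s: float, message_count: int,
                           settings: MqGlobalSettings) -> bool:
    """Assumed rule: stop assembling once the idle time or the message limit is reached."""
    return (idle_s >= settings.flow_timeout_s
            or message_count >= settings.flow_element_limit)


# Example values are made up for the illustration.
settings = MqGlobalSettings(
    operation_load_time_threshold_s=2.5,
    flow_timeout_s=30.0,
    flow_element_limit=100,
)
```

With these example values, an operation lasting 2.5001 seconds counts as slow, while one lasting exactly 2.5 seconds does not.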

Availability

By configuring the availability, you can determine which attempt failures are included in the availability metric calculation.

You can configure IBM WebSphere MQ availability globally or at the software service level.

For global configuration, open the NAM Probe configuration and go to Global ► Middleware Monitoring ► IBM WebSphere MQ ► Availability. For the software service level, select the Availability tab in the Edit Rule window.

Failures (transport)

For IBM WebSphere MQ, you can determine whether each of the following errors should be included in the calculation of the Failures (transport) metric. All of them are disabled by default.

  • No response
  • MQ client errors
  • MQ server errors
  • MQ security errors
  • MQ protocol errors

Note that all of them are configurable.
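The effect of enabling or disabling these error categories can be sketched as follows. This is only an illustration: it assumes availability is the percentage of attempts that did not end in a counted failure, which this section does not state explicitly, and the category keys and function name are hypothetical.

```python
# Hypothetical keys for the five transport error categories listed above.
TRANSPORT_ERRORS = (
    "no_response",
    "client_error",
    "server_error",
    "security_error",
    "protocol_error",
)


def availability_pct(total_attempts, error_counts, enabled):
    """Return availability in percent, counting only the enabled error categories.

    Assumed formula: 100 * (attempts - counted failures) / attempts.
    Returns None when there were no attempts at all.
    """
    unknown = set(enabled) - set(TRANSPORT_ERRORS)
    if unknown:
        raise ValueError(f"unknown error categories: {sorted(unknown)}")
    if total_attempts == 0:
        return None
    failures = sum(error_counts.get(kind, 0) for kind in enabled)
    return 100.0 * (total_attempts - failures) / total_attempts


# Made-up counts for one monitoring interval.
counts = {"no_response": 10, "server_error": 30}
default_result = availability_pct(200, counts, enabled=set())
with_server_errors = availability_pct(200, counts, enabled={"server_error"})
```

With every category disabled (the default), none of the 40 errors lowers the result. Enabling only the server error category counts 30 of the 200 attempts as failures.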

Failures (application)

You can decide whether each of the five MQ application errors should be reported as Failures (application).

Fault domain isolation

Thresholds

Use the following threshold settings to accurately identify the true source of the problem:

Server time threshold
The Server time threshold relates to the server time portion of an overall operation time. Server times above the threshold limit are considered to be slow due to poor datacenter performance.

Idle time threshold
Threshold for the time during operation execution when there is no network or server activity related to the operation. Idle time is assumed to be caused by the user's software not sending requests because the user's PC is busy.

Network time threshold
Threshold for the time the network (between the user and the server) takes to deliver requests to the server and to deliver response data back to the user. In other words, Network time is the portion of transaction time that is due to delivery time on the network.

Retransmissions threshold
Threshold for retransmissions, expressed as a percentage of all observed transmissions.

Network time affected by high retransmission threshold
Threshold for the percentage of network time that is affected by a high retransmission rate.

Request size threshold
Threshold for the request size; any request larger than this value is considered a big request.

Network time affected by the transfer of a big request threshold
Threshold for the network time that is affected by the transfer of a big request.

Response size threshold
Threshold for the response size; any response larger than this value is considered a big response.

Network time affected by the transfer of a big response threshold
Threshold for the network time that is affected by the transfer of a big response.

Number of hits threshold
Threshold for the number of subcomponents of error-free operations or transactions.

Single hit duration threshold
Threshold for the duration of a single hit, expressed as a percentage of operation time.

RTT threshold
Threshold for the time it takes for a SYN packet to travel from the client to a monitored server and back again.
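
To make the role of these thresholds more concrete, the following Python sketch compares measured values against configured limits and reports which ones are exceeded. It does not reproduce the fault domain isolation logic used by NAM, which this section does not describe. The metric names, limits, and measurements are hypothetical.

```python
def retransmission_pct(retransmitted, transmitted):
    """Retransmissions as a percentage of all observed transmissions."""
    if transmitted == 0:
        return 0.0
    return 100.0 * retransmitted / transmitted


def exceeded_thresholds(measured, thresholds):
    """Return the sorted names of metrics whose measured value is above its threshold.

    Metrics without a measurement are skipped rather than treated as zero.
    """
    return sorted(
        name
        for name, limit in thresholds.items()
        if name in measured and measured[name] > limit
    )


# Hypothetical limits and measurements for a single operation.
thresholds = {
    "server_time_s": 1.0,
    "network_time_s": 0.5,
    "idle_time_s": 2.0,
    "retransmissions_pct": 2.0,
}
measured = {
    "server_time_s": 1.4,
    "network_time_s": 0.2,
    "retransmissions_pct": retransmission_pct(5, 100),
}
flags = exceeded_thresholds(measured, thresholds)
```

Here the server time and the retransmission percentage exceed their limits, which would point toward the server and the network as candidate sources of the slowdown, while the network time itself stays within its limit.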