Calculating availability

Availability is measured and presented as the percentage of successful attempts (operations) compared to all attempts.

Formula

The availability metric is calculated as the percentage of successful attempts:

 Availability = 100% * (All Attempts - All failures) / All Attempts

where:

  • All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure
  • All failures = all failures (transport) + all failures (TCP) + all failures (application)
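
As a worked illustration of the formula above, the following Python sketch computes availability from per-category counts. The function name, parameter names, and sample counts are hypothetical and are not part of the product. Returning no value when there are no attempts is also an assumption.

```python
def availability(successful_operations, standalone_hits, aborts,
                 failures_tcp, failures_transport, failures_application):
    """Return availability as a percentage, or None when there are no attempts.

    standalone_hits and aborts are the counts NOT classified as failures.
    """
    all_failures = failures_tcp + failures_transport + failures_application
    all_attempts = (all_failures + successful_operations
                    + standalone_hits + aborts)
    if all_attempts == 0:
        return None  # assumption: availability is undefined without attempts
    return 100.0 * (all_attempts - all_failures) / all_attempts


# Hypothetical counts: 940 operations, 20 standalone hits, 10 aborts,
# and 30 failures (5 TCP, 15 transport, 10 application) = 1000 attempts.
print(availability(940, 20, 10, 5, 15, 10))  # 97.0
```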

Each attempt is classified as one of the following: operation, standalone hit, abort, failure (TCP), failure (transport), or failure (application). The classification depends on the configuration.

An attempt may fall into more than one category (for example, a TCP failure and an application failure), but it is counted in only one category using the following priority:

  1. Failure (TCP)
  2. Failure (transport)
  3. Failure (application)
  4. Abort
  5. Operation
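
The priority rule above can be sketched in Python as follows. This is a minimal illustration, not the product's implementation: the category strings and the `classify` function are hypothetical names. Standalone hits are left out because the list above does not assign them a priority.

```python
# Categories in priority order, highest first, as listed above.
PRIORITY = [
    "failure (TCP)",
    "failure (transport)",
    "failure (application)",
    "abort",
    "operation",
]


def classify(matched_categories):
    """Return the single category in which an attempt is counted."""
    for category in PRIORITY:
        if category in matched_categories:
            return category
    raise ValueError("attempt matches no known category")


# An attempt that is both a TCP failure and an application failure
# is counted only as a TCP failure.
print(classify({"failure (application)", "failure (TCP)"}))  # failure (TCP)
```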

Availability is calculated for all the analyzers capable of reporting operations.

Operation

An operation counts only if it is a successful operation. The operation count does not include failures, aborts, and standalone hits.

Abort

An operation manually aborted by a user (for example, by clicking the browser's Stop button). Note that you may classify an abort as a failure (transport) using the availability configuration, in which case it is no longer part of the aborts count.

Standalone hit

An incomplete response (a hit that is not included in any reported operation). An incomplete response classified as a Failure (transport) is not included in the standalone hits count.

Failure (TCP)

An operation that failed due to a TCP error. Failures (TCP) have the highest priority.

Failure (transport)

A Failure (transport) relates to problems occurring in the transport layer of a protocol monitored by the NAM Probe:

  • Errors in the transport layer.

  • SSL alerts classified as a failure. SSL errors are also treated as transport failures. You can specify which SSL alert codes should be classified as availability problems, separately for alerts sent by the server and alerts sent by the client. For more information, see Advanced - SSL options.

  • Aborts classified as a failure in the configuration.

  • Incomplete responses classified as a failure in the configuration.

The priority of transport failures is lower than that of TCP failures. This means that the failures (transport) metric does not take into account any operation that was reported as a TCP failure, even if a transport error also occurred.

The configuration enables you to decide which types of errors, incomplete responses, or aborts should be taken into account when calculating availability. Additionally, you can limit failure reporting to specific conditions. The set of error types available for failure reporting depends on the analyzer.
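
To illustrate how this configuration affects the result, the following Python sketch moves aborts and incomplete responses between the non-failure and failure counts. The two boolean flags are hypothetical stand-ins for the availability configuration, not real setting names, and the counts are invented sample values.

```python
def availability_with_config(operations, aborts, incomplete_responses,
                             other_failures, aborts_as_failures=False,
                             incomplete_as_failures=False):
    """Return availability (%) under a hypothetical configuration."""
    failures = other_failures
    non_failures = operations
    # An abort counts either as a failure (transport) or as a plain abort.
    if aborts_as_failures:
        failures += aborts
    else:
        non_failures += aborts
    # The same applies to incomplete responses (standalone hits).
    if incomplete_as_failures:
        failures += incomplete_responses
    else:
        non_failures += incomplete_responses
    attempts = failures + non_failures
    if attempts == 0:
        return None  # assumption: undefined without attempts
    return 100.0 * (attempts - failures) / attempts


# The total number of attempts stays at 1000 in every case.
print(availability_with_config(900, 50, 30, 20))                           # 98.0
print(availability_with_config(900, 50, 30, 20, aborts_as_failures=True))  # 93.0
print(availability_with_config(900, 50, 30, 20, True, True))               # 90.0
```

Reclassifying aborts or incomplete responses does not change the number of attempts. It only moves them into the failure count, which lowers availability.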

Failure (application)

A Failure (application) relates to problems occurring in the application layer. You can select which operation attributes should be reported as application problems.

The priority of application failures is lower than that of transport failures. This means that the failures (application) metric does not take into account any operation that was reported as a TCP or transport failure, even if an application error also occurred. Some analyzers are preconfigured to detect typical application problems.

Failures (application) is available only for analyzers capable of detecting operation attributes.

Reporting availability

The key availability metric is used on the Data Center Analysis, EUE Overview, and Software Services reports. It is accompanied by a breakdown into transport, TCP, and application contexts.

Availability tooltip

Note that the introduction of the new availability calculation affects most data reported by the NAM Server and ADS. Unlike in previous releases, the operations count does not include failures. Consequently, all metrics calculated using the operations counter may report different values. This includes Operation Time, because failures are not included in its calculation. In environments that record many failures, operation time may increase after the upgrade.
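
The effect on Operation Time can be illustrated with a small Python sketch. It assumes, purely for illustration, that operation time is a simple average and that failed attempts tend to finish quickly. The sample values and the function name are hypothetical.

```python
from statistics import mean

# Hypothetical samples: (operation time in seconds, is_failure).
attempts = [(2.0, False), (3.0, False), (4.0, False), (0.2, True), (0.3, True)]


def average_operation_time(samples, include_failures):
    """Average the times, optionally leaving out failed attempts."""
    times = [t for t, failed in samples if include_failures or not failed]
    return mean(times)


# Including the fast failures pulls the average down.
print(average_operation_time(attempts, include_failures=True))
# Excluding failures leaves only the successful operations.
print(average_operation_time(attempts, include_failures=False))  # 3.0
```

Under these assumptions, excluding the short failed attempts raises the average, which matches the increase described above.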

The Errors metric is replaced by "Application Responses". On reports, the Failures metric is used instead and shows only those errors that are critical. To analyze the reason for a failure and its related errors or responses, drill down to the detailed error reports, including the new Application Responses report, either directly from the Failures (total) column or from the drill-down menu.

The availability is reported by means of the following metrics available in the DMI Data Views.

Availability (total)

The percentage of successful attempts, calculated using the following formula:

Availability (total) = 100% * (All Attempts – All failures) / All Attempts

where

All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure

All failures = all failures (transport) + all failures (TCP) + all failures (application).

Availability (application)

Availability limited to the application context, calculated using the following formula:

Availability (application) = 100% * (All Attempts – Failures (application)) / All Attempts

where

All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure.

Availability (TCP)

Availability limited to the network context, calculated using the following formula:

Availability (TCP) = 100% * (All Attempts – Failures (TCP)) / All Attempts

where

All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure.

Availability (transport)

Availability limited to the transport context, calculated using the following formula:

Availability (transport) = 100% * (All Attempts – Failures (transport)) / All Attempts

where

All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure.
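
The four availability metrics above share the same denominator and differ only in which failures are subtracted. The following Python sketch shows this under hypothetical counts. The function name and dictionary keys are illustrative and are not DMI identifiers.

```python
def availability_breakdown(all_attempts, failures_tcp,
                           failures_transport, failures_application):
    """Return the total and per-context availability percentages."""
    if all_attempts <= 0:
        raise ValueError("all_attempts must be positive")

    def pct(failures):
        return 100.0 * (all_attempts - failures) / all_attempts

    return {
        "total": pct(failures_tcp + failures_transport + failures_application),
        "application": pct(failures_application),
        "TCP": pct(failures_tcp),
        "transport": pct(failures_transport),
    }


# Hypothetical counts: 1000 attempts with 5 TCP, 15 transport,
# and 10 application failures.
result = availability_breakdown(1000, 5, 15, 10)
print(result)
# {'total': 97.0, 'application': 99.0, 'TCP': 99.5, 'transport': 98.5}
```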

Failures (total)

The total number of failures, that is, all Failures (transport) + all Failures (TCP) + all Failures (application).

Failures (application)

The number of operation attributes of all types set to be reported as an application failure.

Failures (TCP)

The total number of operations that failed due to Connection refused or Connection establishment timeout errors.

Failures (transport)

The number of operations that failed due to problems in the transport layer. These include protocol errors, SSL alerts classified as failures, and incomplete responses selected to be classified as failures.

Health index

A metric that combines aspects of performance and availability. It is calculated as the percentage of fast (and successful) operations out of all attempts.
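
As a minimal sketch of this calculation, the following Python function divides the number of fast, successful operations by all attempts. What counts as "fast" is not defined in this section, so the count is taken as an input. The function name and sample values are hypothetical.

```python
def health_index(fast_successful_operations, all_attempts):
    """Return the health index as a percentage, or None without attempts."""
    if all_attempts == 0:
        return None  # assumption: undefined without attempts
    return 100.0 * fast_successful_operations / all_attempts


# Hypothetical counts: 850 fast, successful operations out of 1000 attempts.
print(health_index(850, 1000))  # 85.0
```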