Application user experience

Aborted transactions

The number of transactions aborted because of an HTTP timeout. This metric is calculated only for Client tiers.

Aborts

The number of aborted operations. This metric is calculated only for the Network tier, the Client network tier, the Client optimized network tier, and Data center tiers.

Affected users (availability)

The number of unique users that were affected by TCP availability problems. For Client optimized network, Client network, and Network tiers, this metric is not calculated.

Affected users (network)

The number of unique users that experienced network performance problems.

Affected users (performance)

The number of unique users that experienced application performance problems or network performance problems. For the Client optimized network tier, this metric is not calculated.

Application Monitoring Operations

The number of Application Monitoring operations.

Application health index

The percentage of fast operations calculated as "Fast Operations / (Failures + Operations) * 100%".
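The formula above can be sketched in a few lines of Python. This is only an illustration: the function name, the parameter names, and the choice to return 0.0 for an interval without traffic are assumptions, not part of the product.

```python
def application_health_index(fast_operations: int, failures: int, operations: int) -> float:
    """Return Fast Operations / (Failures + Operations) as a percentage."""
    attempts = failures + operations
    if attempts == 0:
        # Assumption: with no traffic the ratio is undefined, so this sketch reports 0.0.
        return 0.0
    return 100.0 * fast_operations / attempts


# 900 fast operations out of 50 failures + 950 operations
print(application_health_index(fast_operations=900, failures=50, operations=950))  # 90.0
```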

Availability (TCP)

Availability limited to the network context, calculated using the following formula:

Availability (TCP) = 100% * (All Attempts – Failures (TCP)) / All Attempts

where All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure.

Availability (application)

Availability limited to the application context, calculated using the following formula:

Availability (application) = 100% * (All Attempts – Failures (Application)) / All Attempts

where All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure.

Availability (total)

Depending on the particular tier, this term may mean:

  • For Client tiers: the percentage of successful attempts, calculated as 100-100*(failures/attempts).

  • For the Citrix/WTS (presentation) tier: the percentage of successful TCP connection attempts, calculated as 100-100*(failures/attempts).

  • For other Network tiers: the percentage of successfully sent packets, calculated as 100-100*(sent packets that were lost/total number of sent packets).

  • For other Data center tiers: the percentage of successful attempts, calculated using the following formula: Availability (total) = 100% * (All Attempts – All failures) / All Attempts

    where All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts, and All failures = all failures (transport) + all failures (TCP) + all failures (application).
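As a rough illustration of how the availability formulas for Data center tiers fit together, the following Python sketch computes All attempts and All failures from raw counts and derives the total and per-context availability from them. The class, field, and function names are hypothetical, and raising an error when there are no attempts is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class AttemptCounts:
    failures_transport: int
    failures_tcp: int
    failures_application: int
    successful_operations: int
    standalone_hits: int  # standalone hits not classified as a failure
    aborts: int           # aborts counted as attempts

    @property
    def all_failures(self) -> int:
        return self.failures_transport + self.failures_tcp + self.failures_application

    @property
    def all_attempts(self) -> int:
        return (self.all_failures + self.successful_operations
                + self.standalone_hits + self.aborts)


def availability(counts: AttemptCounts, failures: int) -> float:
    """Return 100% * (All Attempts - failures) / All Attempts."""
    if counts.all_attempts == 0:
        # Assumption: availability is undefined without any attempts.
        raise ValueError("no attempts observed")
    return 100.0 * (counts.all_attempts - failures) / counts.all_attempts


counts = AttemptCounts(failures_transport=2, failures_tcp=3, failures_application=5,
                       successful_operations=980, standalone_hits=6, aborts=4)
print(availability(counts, counts.all_failures))  # Availability (total): 99.0
print(availability(counts, counts.failures_tcp))  # Availability (TCP): 99.7
```

Passing a single failure category instead of All failures gives the context-limited variants, such as Availability (TCP).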

Availability (transport)

Availability limited to the transport context, calculated using the following formula:

Availability (transport) = 100% * (All Attempts – Failures (Transport)) / All Attempts

where All attempts = all failures + all successful operations + all standalone hits not classified as a failure + all aborts not classified as a failure.

Average CPU utilization

The percentage of elapsed time that the processor spent to execute non-idle threads. This counter is the primary indicator of processor activity, and shows the average percentage of busy time.

Average memory utilization

The average percentage of used physical memory (RAM).

Client RTT

Client RTT is the time it takes for a SYN ACK packet (sent by a server) to travel from the AMD to the client and for the acknowledgment to return to the AMD, as shown in the following picture.

[Figure: Client RTT measurement]

A client RTT measurement begins when the SYN ACK packet from the server to the client passes by the AMD (T5). The packet reaches the client machine (T6) and is processed, and an acknowledgment is sent back to the server (T7). The impact of client processing time (T7-T6) is very low. The measurement ends when the ACK packet reaches the AMD (T8). Therefore, Client RTT is calculated as T8-T5.

Depending on the actual setup, Client RTT measurements may vary dramatically. In corporate environments, Client RTT may be a few milliseconds for LAN-connected clients or a few dozen milliseconds for WAN-connected clients. When the client connects from the Internet, the end-to-end Client RTT measurement combines transit time through the Internet backbone and through the "last mile" access network. The impact of the last mile can be easily calculated from the connection speed and the packet size (56 B for a TCP SYN packet). For a 28 kbps dial-up connection, this amounts to 16 milliseconds one way, or 32 milliseconds for a complete round trip. For a 1.6 Mbps DSL line, it amounts to 0.28 milliseconds one way, or 0.56 milliseconds for a complete round trip.
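The last-mile contribution mentioned above is the time needed to place the packet on the access link. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the sketch assumes a symmetric link and ignores propagation and queuing delays.

```python
def last_mile_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """One-way time, in milliseconds, to transmit a packet over the access link."""
    return packet_bytes * 8 * 1000.0 / link_bps


SYN_PACKET_BYTES = 56  # packet size quoted in the text

one_way = last_mile_delay_ms(SYN_PACKET_BYTES, 28_000)  # 28 kbps dial-up
print(one_way)      # 16.0 ms one way
print(2 * one_way)  # 32.0 ms for the round trip
```

The same function can be applied to any other link speed to estimate how much of a measured Client RTT is attributable to the access network.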

Client Volume

The number of client transmitted bytes.

Client loss rate

The percentage of total packets sent by a client that were lost and needed to be retransmitted. This metric is calculated only for the following tiers: RUM sequence transactions, Citrix/WTS (presentation), Client optimized network (for WAN and Pass-through deployment only), and tiers based on TCP-based analyzers.

DNS errors

The number of DNS errors.

Database errors

The number of database errors in the database analyzer:

  • For TDS, which includes Sybase and MS SQL Server, any value from the following table is considered an error.

  • For MySQL, if an ERR_Packet is returned, the error count is incremented.

An error with a severity level of 19 or higher stops the execution of the current SQL batch and the error message is written to the error log.

Errors that can be corrected by the user:

11: The given object or entity does not exist.

12: SQL statements that do not use locking because of special options. In some cases, read operations performed by these SQL statements could result in inconsistent data, because locks are not taken to guarantee consistency.

13: Transaction deadlock errors.

14: Security-related errors such as permission denied.

15: Syntax errors in the SQL statement.

16: General errors that can be corrected by the user.

Software errors that cannot be corrected by the user and that require system administrator action:

17: The SQL statement caused the database server to run out of resources (such as memory, locks, or disk space for the database) or to exceed some limit set by the system administrator.

18: There is a problem in the database engine software, but the SQL statement completes execution, and the connection to the instance of the database engine is maintained. System administrator action is required.

19: A non-configurable database engine limit has been exceeded and the current SQL batch has been terminated.

System problems:

20-25: Fatal errors, meaning that the database engine task that was executing a SQL batch is no longer running. The task records information about what occurred and then terminates. In most cases, the application connection to the instance of the database engine also terminates. If this happens, depending on the problem, the application might not be able to reconnect.
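The severity table above can be read as a simple lookup. The following Python sketch maps a TDS severity level to the three groups listed and counts errors accordingly. The function names and category labels are hypothetical, and the sketch assumes that severities outside the 11 to 25 range are not counted as errors.

```python
from typing import Iterable, Optional


def classify_tds_severity(severity: int) -> Optional[str]:
    """Map a TDS severity level to one of the groups in the table above."""
    if 11 <= severity <= 16:
        return "user-correctable error"
    if 17 <= severity <= 19:
        return "software error requiring administrator action"
    if 20 <= severity <= 25:
        return "system problem (fatal)"
    # Assumption: anything outside 11-25 is not treated as an error.
    return None


def count_database_errors(severities: Iterable[int]) -> int:
    """Count the severities that fall into any error group."""
    return sum(1 for s in severities if classify_tds_severity(s) is not None)


print(classify_tds_severity(13))                  # user-correctable error
print(count_database_errors([0, 10, 14, 17, 21]))  # 3
```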

Database warnings

The number of database warnings in the database analyzer:

  • For TDS, which includes Sybase and MS SQL Server, this count will always be zero. TDS does not track anything as a warning.

  • For MySQL, if an OK_Packet is returned, the warning count value in that packet is checked and the total warning field is updated with the returned number.

End-to-end RTT

The time it takes for a SYN packet to travel from the client to a monitored server and back again.

Failures (TCP)

The number of operations that failed due to one of the TCP errors.

Failures (application)

The number of operation attributes of all types set to be reported as an application failure.

Failures (total)

The total number of failures: Failures (transport) + Failures (TCP) + Failures (application)

Failures (transport)

The number of operations that failed due to problems in the transport layer. You can configure Failures (transport) to include the following: protocol errors, SSL alerts, aborts, and incomplete responses.

Fast operations/transactions

The number of operations or transactions for which the execution time was below a predefined threshold value. These include HTTP/HTTPS page loads, SQL database queries, XML (transactional services) operations, e-mails, DNS requests, Oracle Forms submissions, MQ operations, MS Exchange operations, SAP operations, and transactions (for RUM data).

HTTP errors

The number of observed HTTP client errors (4xx) and server errors (5xx).
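As a small illustration, counting HTTP errors from a list of observed status codes could look like the following Python sketch. The function name and the input format are assumptions made for the example.

```python
from typing import Iterable


def count_http_errors(status_codes: Iterable[int]) -> int:
    """Count responses whose status is a client (4xx) or server (5xx) error."""
    return sum(1 for code in status_codes if 400 <= code <= 599)


print(count_http_errors([200, 301, 404, 500, 503, 204]))  # 3
```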

Idle time

The part of the operation time spent between receiving a part of the response and requesting a subsequent part. It enables you to isolate the time taken by the client from the time when the data was still being transmitted on the network. For RUM sequence transactions, the idle time is equal to the client time.

Incomplete responses

The number of incomplete responses, that is, partial and server-aborted responses, as well as situations in which a server did not respond to the request at all or responded in an unrecognizable way.

LDAP errors

The number of LDAP Errors. The LDAP Errors are reported in the following categories:

  • LDAP critical errors

  • LDAP server errors

  • LDAP security errors

  • LDAP syntax errors

  • LDAP client error

Long aborts

For HTTP, this is the number of operations manually stopped by the user, by clicking the Stop or Refresh button or by selecting another URL, after at least 8 seconds of waiting for the page download (8 seconds is the default). For XML, this is the number of transactions stopped after waiting at least a threshold number of seconds (8 seconds is the default).
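The threshold rule that separates long aborts from short aborts (those stopped before the threshold) can be sketched as follows. The function and constant names are hypothetical; only the 8-second default comes from the definitions in this glossary.

```python
DEFAULT_ABORT_THRESHOLD_S = 8.0  # default threshold quoted in the text


def classify_abort(wait_seconds: float,
                   threshold_s: float = DEFAULT_ABORT_THRESHOLD_S) -> str:
    """Return "long" if the user waited at least the threshold, else "short"."""
    return "long" if wait_seconds >= threshold_s else "short"


print(classify_abort(3.2))   # short
print(classify_abort(8.0))   # long
print(classify_abort(12.5))  # long
```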

MQ appl. errors

The number of operation attributes of all types set to be reported as MQ application errors for software services based on an MQ analyzer.

MQ errors

The total number of IBM WebSphere Message Queue errors, including client errors, server errors, protocol errors and security errors.

MS Exchange errors

Total number of RPC server and RPC protocol errors. Counted only for MS Exchange analyzers.

Network performance

The percentage of total traffic that did not experience network-related problems (traffic in which the values of loss rate and RTT did not exceed configured thresholds).

Network time

The time the network takes to deliver the request to the server and to deliver the resulting response back to the user. In other words, network time is the portion of the operation time that is spent on transferring data over the network.

Operation attributes

The number of operation attributes of all types (type 1 to 5), observed for the given software service.

Operation/Transaction time

The average value of operation or transaction time for all operations or transactions performed on the particular tier.

Operations/Transactions

Depending on the tier definition and on the traffic analyzer used, this metric shows the number of:

  • HTTP(S) operations

  • SQL database queries

  • XML (transactional services) operations

  • E-mail messages

  • DNS requests

  • Oracle Forms submissions

  • MQ operations

  • MS Exchange operations

  • SAP operations

  • Transactions (for RUM data)

Other time

For RUM sequence transactions, the other time is a sum of the client time, the client response time, and the application processing time. For synthetic transactions, the other time is equal to the client time. For RUM Browser data, the other time is equal to the client time if provided by the Application Monitoring server. The other time is not calculated for Dynatrace Performance Network data.

Performance

Depending on the particular tier, the term performance can mean:

  • For Client tiers: the percentage of transactions completed in a time shorter than the defined time threshold, calculated as 100-100*(slow transactions/all transactions).

  • For the Client optimized network tier: the percentage of compressed bytes.

  • For other Network tiers: the percentage of total traffic that did not experience network-related problems.

  • For Data center tiers: for transactional protocols, this is the percentage of software service operations completed in a time shorter than the performance threshold. For transactionless, TCP-based protocols, this is the percentage of monitoring intervals in which user wait time per kB of data was shorter than the threshold value.

RMI/Universal decode errors

Total number of RMI/Simple parser errors.

RTT measurements

The number of RTT measurements. An RTT measurement occurs during every TCP handshake, so it provides some insight into the number of attempted TCP sessions, and the potential accuracy of the RTT measurements that are reported. This metric is calculated only for the following tiers: RUM sequence transactions, Citrix/WTS (presentation), Client optimized network (for LAN only), and tiers based on TCP-based analyzers.

Realized bandwidth

The actual transfer rate of server data when the transfer attempt occurred. This metric takes into account factors such as loss rate (retransmissions).

Redirect time

The average amount of time that was spent between the time when a user went to a particular URL and the time this user was redirected to another URL and issued a request to that new URL. The difference between Redirect Time and HTTP Redirect Time is that the former counts all operations, while the latter refers only to those operations for which redirection actually took place.

Response messages

The total number of protocol-specific server responses. That includes both errors and other identifiable response strings, as configured in monitoring.

SAP errors

The number of errors detected on the protocol level in communication between the SAP application server and the SAP GUI client, as well as between the SAP application server and third-party clients using Remote Function Calls (RFC).

SMTP errors

The total number of SMTP errors.

SSL errors

The number of all SSL alerts. This metric is the sum of SSL errors 1, SSL errors 2, and Other SSL errors.

Server RTT

The time it takes for a SYN packet to travel from the AMD to a monitored server and back again. This metric is calculated only for the following tiers: RUM sequence transactions, Citrix/WTS (presentation), Client optimized network (for LAN only), and tiers based on TCP-based analyzers.

Server TCP data packets

The total number of TCP packets sent by the servers, excluding the traffic control packets. This metric is calculated only for the following tiers: RUM sequence transactions, Citrix/WTS (presentation), Client optimized network (for LAN only), and tiers based on TCP-based analyzers.

Server Volume

The number of server transmitted bytes.

Server loss rate

The percentage of total packets sent by a server that were lost between the AMD and the server and needed to be retransmitted. This metric is calculated only for the following tiers: RUM sequence transactions, Citrix/WTS (presentation), Client optimized network (for WAN and Pass-through deployment only), and tiers based on TCP-based analyzers.

Server time

The time it took the server to produce a response for the given request.

Short aborts

The number of transactions stopped before timeout. For HTTP, this is the number of page loads manually stopped by the user, by clicking the Stop or Refresh button or by selecting another URL, before 8 seconds of waiting for the page download (8 seconds is the default). For XML, this is the number of transactions stopped before a threshold number of seconds of waiting (8 seconds is the default).

Slow operations (application design - # of components)

The number of slow operations caused by the number of components, which is one of the detailed reasons in the application design category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (application design - redirect time)

The number of slow operations caused by redirect time, which is one of the detailed reasons in the application design category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (application design - request size)

The number of slow operations caused by request size, which is one of the detailed reasons in the application design category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (application design - response size)

The number of slow operations caused by response size, which is one of the detailed reasons in the application design category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (client/3rd party)

The number of slow operations caused by the client/3rd party category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (data center)

The number of slow operations caused by the data center category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (multiple reasons)

The number of slow operations caused by multiple reasons, that is, when the algorithm was not able to determine one primary reason for slowness.

Slow operations (network - latency)

The number of slow operations caused by latency, which is one of the detailed reasons in the network category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (network - loss rate)

The number of slow operations caused by loss rate, which is one of the detailed reasons in the network category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations (network - other)

The number of slow operations caused by factors other than latency or loss rate, which is one of the detailed reasons in the network category as calculated using the primary reason for slowness algorithm. Note that this includes only successful operations. Failures and aborted operations are not taken into account.

Slow operations/transactions

The number of operations or transactions for which the execution time was above a predefined threshold value. These include HTTP/HTTPS page loads, SQL database queries, XML (transactional services) operations, e-mails, DNS requests, Oracle Forms submissions, MQ operations, MS Exchange operations, SAP operations, and transactions (for RUM data).

TCP errors

The total number of TCP errors.

Those errors may indicate server or application problems and therefore measurements of those are critical to understanding the issues that may affect end-user experience. AMDs measure and report on the following types of TCP errors:

  • Connection Refused Errors - Client attempts to open a TCP session with a server, which rejects the request. SYN packet from Client is followed by RESET packet from Server, with matching TCP sequence numbers. This error is typically caused by resource exhaustion on the server, which is unable to accept more concurrent TCP sessions. This may be either a configuration issue (too few resources allocated in the kernel) or lack of memory. SYN flood attacks typically result in servers being unable to accept new connections.

  • Server session termination error - Server is unexpectedly terminating a connection that was successfully opened. The server sends a RESET packet to the Client. Such an error originates in the application that uses the monitored TCP session. It does not necessarily mean application failure; usually it means that the application encountered a condition in which it decided to immediately terminate the session with the client, for example, because of an application security policy violation by the client.

  • Session Abort - Client is unexpectedly terminating a connection that was successfully opened. The Client sends a RESET packet to the Server. These errors are inspected in the context of the client application and may or may not be reported. For example, the browser running HTTP may terminate the load of a GIF file if it is older than the one that it had previously cached and this is normal behavior. However, if all connections to the server are terminated because the user hits the STOP button, then this is abnormal session termination and is reported as "Aborted operation" or "Stopped Page".

  • Client not responding errors (server timeout errors) - The server networking stack assumes that the network connection to the client exists, but the client remains idle and does not respond. In such a case, the server closes the TCP session with a RESET packet. Such a condition may occur when the client has been silently disconnected from the network, for example, due to a link failure, or when the client has crashed. Note that this error will not occur if the client has ended the session gracefully, for example, by closing the client application.

  • Server not responding errors (client timeout errors) - The client networking stack assumes that the network connection to the server exists, but the server remains idle and does not respond. In such a case, the client closes the TCP session with a RESET packet. This may occur either during the session setup phase (no response to the SYN packet) or during normal data exchange. Such a situation may result from intermittent network problems between the client and the server. If the traffic is routed through asymmetric paths across the Internet, which is often the case, the path from the server to the client may be broken.
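The five error types above differ mainly in who sends the RESET packet, whether the session was already established, and whether the other side had gone silent. The following Python sketch encodes that decision table. It is only an illustration of the descriptions in this list: the names and the three input flags are hypothetical, and the actual detection logic used by the AMD is not described here.

```python
from enum import Enum
from typing import Optional


class TcpError(Enum):
    CONNECTION_REFUSED = "connection refused"
    SERVER_SESSION_TERMINATION = "server session termination"
    SESSION_ABORT = "session abort"
    CLIENT_NOT_RESPONDING = "client not responding"
    SERVER_NOT_RESPONDING = "server not responding"


def classify_reset(reset_sender: str, session_established: bool,
                   peer_idle: bool) -> Optional[TcpError]:
    """Classify a RESET by its sender ("client" or "server"), by whether the
    TCP handshake had completed, and by whether the other side was idle."""
    if reset_sender == "server":
        if not session_established:
            return TcpError.CONNECTION_REFUSED      # SYN answered with RESET
        if peer_idle:
            return TcpError.CLIENT_NOT_RESPONDING   # server timed out on the client
        return TcpError.SERVER_SESSION_TERMINATION
    if reset_sender == "client":
        if peer_idle:
            return TcpError.SERVER_NOT_RESPONDING   # during setup or data exchange
        if session_established:
            return TcpError.SESSION_ABORT
    # Assumption: combinations not described in the list are left unclassified.
    return None


print(classify_reset("server", session_established=False, peer_idle=False))
print(classify_reset("client", session_established=True, peer_idle=False))
```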

Total bandwidth usage

The number of all transmitted bits (client + server) per second.

Total network time

The difference between Total transaction time and the sum of Total server time and Total redirect time. This metric is calculated only for the Data center tiers and for the following dimension combinations: Application-Tier and Application-Transaction-Tier.

Total redirect time

The sum of the averages of redirect time of all operations assigned to a transaction. This metric is used to indicate the redirect time used to achieve the result of multi-step transactions. It is calculated only for Data center tiers and for the following dimension combinations: Application-Tier and Application-Transaction-Tier.

Total server time

The sum of the averages of server time of all operations assigned to a transaction. This metric is used to indicate the server time used to achieve the result of multi-step transactions. It is calculated only for Data center tiers and for the following dimension combinations: Application-Tier and Application-Transaction-Tier.

Total transaction time

The sum of the averages of operation time of all operations assigned to a transaction. This metric is used to indicate the total time used to achieve the result of multi-step transactions. It is calculated only for Data center tiers and for the following dimension combinations: Application-Tier and Application-Transaction-Tier.
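The four "Total" time metrics are related: three are sums of per-operation averages, and Total network time is what remains of the transaction time after server and redirect time are subtracted. A minimal Python sketch follows, assuming the samples are available as lists of times per operation; the function names and the sample data are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def total_of_averages(samples_by_operation: Dict[str, List[float]]) -> float:
    """Sum the average time of every operation assigned to a transaction."""
    return sum(mean(samples) for samples in samples_by_operation.values() if samples)


def total_network_time(total_transaction: float, total_server: float,
                       total_redirect: float) -> float:
    """Total transaction time minus the sum of total server and redirect time."""
    return total_transaction - (total_server + total_redirect)


# Hypothetical two-step transaction; times in seconds.
operation_time = {"login": [2.0, 4.0], "checkout": [5.0]}
server_time = {"login": [1.0, 2.0], "checkout": [3.0]}
redirect_time = {"login": [0.5, 0.5], "checkout": [0.0]}

transaction = total_of_averages(operation_time)  # 3.0 + 5.0 = 8.0
server = total_of_averages(server_time)          # 1.5 + 3.0 = 4.5
redirect = total_of_averages(redirect_time)      # 0.5 + 0.0 = 0.5
print(total_network_time(transaction, server, redirect))  # 3.0
```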

Transaction errors

The number of errors that originate from Synthetic Monitoring transactions or RUM sequence transactions.

Transactional service errors

The total number of transactional service errors.

Two-way loss rate

The average loss rate calculated for both directions: the sum of client and server retransmitted packets divided by the sum of total client and server packets.
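Because the packet counts are pooled before dividing, the two-way loss rate is weighted by traffic volume and is not the simple mean of the two one-way rates. The Python sketch below illustrates this; the function and parameter names are hypothetical, and returning 0.0 when no packets were seen is an assumption.

```python
def two_way_loss_rate(client_retransmitted: int, client_total: int,
                      server_retransmitted: int, server_total: int) -> float:
    """Return pooled retransmissions over pooled packets, as a percentage."""
    total_packets = client_total + server_total
    if total_packets == 0:
        # Assumption: report 0.0 when no packets were observed.
        return 0.0
    return 100.0 * (client_retransmitted + server_retransmitted) / total_packets


# Client: 10 of 1000 packets lost (1%); server: 90 of 3000 packets lost (3%).
print(two_way_loss_rate(10, 1000, 90, 3000))  # 2.5, not the simple mean of 2.0
```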

Unique users

The number of unique users detected in monitored traffic. Note that for RUM Browser the notion of users refers to visits.

Volume

The number of all transmitted bytes (client + server).