In Part IV, we wrapped up our discussions on bandwidth, congestion and packet loss. In Part V, we examine the four types of processing delays visible on the network, using the request/reply paradigm we outlined in Part I.
Server Processing (Between Flows)
From the network’s perspective, we allocate the time period between the end of a request flow and the beginning of the corresponding reply flow to server processing. Generally speaking, the server doesn’t begin processing a request until it has received the entire flow, i.e., the last packet in the request message; similarly, the server doesn’t begin sending the reply until it has finished processing the request. We sometimes refer to these delays between flows as “pure” processing delays, distinct from another type of intra-flow processing delay we call starved for data and discuss later. Server processing delays occur as a result of a request message, and therefore always occur within a thread.
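As a minimal sketch of the measurement described above, the following Python computes the gap between the last data packet of a request and the first data packet of the reply. The `Packet` record and the sample trace are hypothetical, invented for illustration; they are not Transaction Trace's data model.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    time: float    # capture timestamp, in seconds
    src: str       # "client" or "server" (hypothetical labels)
    payload: int   # TCP payload bytes; 0 for a pure ACK

def server_processing_delay(packets):
    """Gap between the last request data packet and the first reply data packet."""
    data = [p for p in packets if p.payload > 0]   # ignore pure ACKs
    first_reply = next(p for p in data if p.src == "server")
    last_request = max(
        (p for p in data if p.src == "client" and p.time <= first_reply.time),
        key=lambda p: p.time,
    )
    return first_reply.time - last_request.time

# Hypothetical trace: a two-packet request, a pure ACK, then the reply.
trace = [
    Packet(0.000, "client", 1460),
    Packet(0.001, "client", 540),
    Packet(0.002, "server", 0),
    Packet(0.251, "server", 1460),
    Packet(0.252, "server", 1460),
]
delay = server_processing_delay(trace)   # about a quarter of a second
```

Pure ACKs are excluded on purpose: the server's acknowledgement of the request does not mean it has started to reply.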
Transaction Trace Illustrations
These “pure” server processing delays are generally simple to detect, understand, and prove. Transaction Trace’s Node Processing table lists all of the observed processing delays for an operation. If you split this table with the Bounce Diagram and highlight a row of interest, the Bounce Diagram displays the last packet of the request flow and the first packet of the corresponding reply flow, effectively diagramming the measurement.
You may also use the Thread Analysis split with the Bounce Diagram; this will provide a view of the request and reply packet flows as well as the processing measurement.
Starved for Data (Sending Node, Within a Flow)
Sometimes, the network interface is able to transmit data faster than the sending application can deliver it to the TCP socket. For example, a busy FTP server may momentarily interrupt sending a large file because of a disk, memory or CPU bottleneck. We refer to these pauses in the middle of a flow as “starved for data” conditions; nothing on the network (no TCP flow control constraint) prevents the request or reply flow from continuing, so the cause must be internal to the sending node. Starved for data bottlenecks occur within a flow (instead of between flows), and are related to the sending node, either the client or the server.
Transaction Trace Illustration
These cases can be more difficult to visualize. Since the condition is generally not too common, it is often best to rule out other performance bottlenecks first, before checking for data starvation. When it does occur, the condition has the effect of extending the duration of a request or reply flow, and starved for data delays are included in Transaction Trace’s Node Sending measurements. Sort the rate column of the Node Sending Table and split the window with the Bounce Diagram; the Bounce Diagram will illustrate the packets associated with a sending measurement. For those sending measurements where you suspect a starved for data condition, look for idle periods of time where the sending node’s flow has been interrupted. Importantly, a starved for data delay will terminate with the transmission of a data packet that resumes the sender’s flow, not a TCP ACK from the receiver that might suggest a TCP or application window constraint.
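The distinguishing rule above (the idle period ends with a data packet from the sender, not an ACK from the receiver) can be sketched as a simple heuristic. The `Packet` tuple, the `min_gap` threshold and the sample trace are assumptions made for illustration only.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    time: float    # capture timestamp, in seconds
    src: str       # node that sent the packet
    payload: int   # TCP payload bytes; 0 for a pure ACK

def starved_gaps(packets, sender, min_gap=0.1):
    """Return (start, end) idle periods that end with a data packet from the sender."""
    gaps = []
    seen_data = False   # only count pauses once the flow has started
    for prev, cur in zip(packets, packets[1:]):
        if prev.src == sender and prev.payload > 0:
            seen_data = True
        idle = cur.time - prev.time
        resumed_by_sender = cur.src == sender and cur.payload > 0
        if seen_data and idle >= min_gap and resumed_by_sender:
            gaps.append((prev.time, cur.time))
    return gaps

# Hypothetical reply flow from "server" to "client".
trace = [
    Packet(0.000, "server", 1460),
    Packet(0.001, "server", 1460),
    Packet(0.020, "client", 0),      # ACK; the sender stays silent afterwards
    Packet(0.320, "server", 1460),   # the sender resumes on its own: starved
    Packet(0.321, "server", 1460),
    Packet(0.800, "client", 0),      # silence ended by the receiver instead
    Packet(0.801, "server", 1460),
]
gaps = starved_gaps(trace, sender="server")
```

In this trace only the first pause is reported. The second one ends with a packet from the receiver, which points toward a window constraint rather than data starvation.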
Client Processing (Between Flows)
From the network’s perspective, we allocate the time period between the end of a reply flow and the beginning of the next request flow to client processing. Generally speaking, the client cannot begin processing a reply until it has received the entire reply flow, i.e., the last packet of the reply message; similarly, the client doesn’t begin sending the next (new) request until it has completed processing the reply. (This correlation generally applies to request/reply flows on the same TCP connection.)
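A minimal sketch of this measurement, assuming each thread on a single TCP connection is reduced to a hypothetical `(request_start, reply_end)` pair of timestamps in seconds:

```python
def client_gaps(threads):
    """Client processing delays between sequential threads on one connection.

    threads: list of (request_start, reply_end) timestamps in seconds.
    """
    ordered = sorted(threads)
    return [
        max(0.0, next_start - prev_end)   # overlapping threads count as no delay
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]

# Hypothetical threads; the long second gap could be user think time.
threads = [(0.0, 0.4), (0.5, 0.9), (6.9, 7.2)]
gaps = client_gaps(threads)   # roughly 0.1 s and 6.0 s
```

A gap of several seconds between threads is more likely to be a user pausing than the client machine processing a reply, so it deserves a second look before it is blamed on the client.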
Transaction Trace Illustrations
Similar to server processing delays, client delays are relatively simple to understand. In most cases, client delays occur between threads; in other words, after one thread has completed but before the next thread begins. For tasks with thread-level decodes, Transaction Trace’s Thread Analysis Gantt chart view can illustrate these delays well.
Discounting Client Processing
Note that we assume strict adherence to the definition of “operation” here: the interval from a user click to the resulting screen update. If the trace has captured multiple steps (say the user navigates through a series of operations), then the user “think time” between steps will appear as client processing delay, with corresponding gaps between threads. You may still use these multi-step tasks for analysis, remembering to discount client processing delays. Alternatively, you can save each step as a separate task by selecting a sequence of threads from the Thread table.
Receiver Flow Control (Within a Flow)
The last of our four processing delay types is a function of TCP’s sliding window protocol. The receiver’s window advertisements inform the sender how much buffer space is available for the connection; effectively, how much data can be sent safely without worrying about buffer overflow. As packets arrive they are acknowledged as successfully received (the TCP ACK number); as data is read from the receive buffer, space is freed and advertised to the sender (the TCP window size). Often, these two events happen simultaneously (at least from what we can see on the network), and each ACK packet also advertises a full receive window. However, if the receiving node is slow to read the data from the buffer, its acknowledgement packets will advertise a reduced window size. In the extreme case, the receiving node runs out of buffer space and advertises a window size of 0, effectively halting the flow. Once buffer space becomes available, the receiving node sends a window update to inform the sender. In this way, the receiver can control the flow of data from the sender.
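The relationship between buffer occupancy and the advertised window can be illustrated with a toy model. This is only a sketch: the `ReceiveBuffer` class is hypothetical, and it ignores real TCP details such as window scaling and delayed ACKs.

```python
class ReceiveBuffer:
    """Toy model of a TCP receive buffer and the window it advertises."""

    def __init__(self, capacity):
        self.capacity = capacity   # total buffer space, in bytes
        self.unread = 0            # bytes received but not yet read by the application

    def window(self):
        return self.capacity - self.unread

    def on_segment(self, nbytes):
        """Buffer an arriving segment; return the window advertised in the ACK."""
        self.unread += min(nbytes, self.window())
        return self.window()

    def app_read(self, nbytes):
        """The application reads data; return the window for a window update."""
        self.unread -= min(nbytes, self.unread)
        return self.window()

buf = ReceiveBuffer(capacity=3 * 1460)
# Three full segments arrive while the application is busy elsewhere.
windows = [buf.on_segment(1460) for _ in range(3)]   # shrinks to a zero window
# The application finally drains the buffer, reopening the window.
reopened = buf.app_read(3 * 1460)
```

The window shrinks with every segment the application fails to read, and only the application's read reopens it, which is why a zero window points at the receiving node rather than the network.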
Avoiding Silly Windows
“Silly window syndrome” describes the inefficient exchange of small packets, each intended to fill a small amount of remaining space in the receiver’s buffer; silly window avoidance is the set of algorithms employed to prevent this behavior. To avoid it, the sending node will treat an advertised window size smaller than the maximum segment size (a common MSS value is 1460 bytes) the same as a window 0 event; that is, it will stop transmitting and wait for a larger advertised window.
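The sender-side rule described above can be written as a one-line check. This sketch follows the simplified description in this article; real TCP stacks apply additional conditions, and the function name is hypothetical.

```python
MSS = 1460   # a common maximum segment size, in bytes

def usable_window(advertised, mss=MSS):
    """Treat any advertised window smaller than one MSS as a zero window."""
    return advertised if advertised >= mss else 0

# A 500-byte window is treated the same as a window 0 event.
examples = {w: usable_window(w) for w in (8192, 1460, 500, 0)}
```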
Transaction Trace Illustration
To evaluate the impact of receiver flow control, use the Time Plot view. For Series 1, graph the receiver’s advertised TCP window size. For Series 2, graph the sender’s Cumulative Payload sent; use a separate y-axis. (Make sure you are graphing a single TCP connection or a series of sequential TCP connections, not a collection of parallel connections.) Look for small or 0 byte windows, and note the corresponding impact on throughput.
Receiver flow control bottlenecks, like starved for data conditions, occur within a flow (instead of between flows), and are related to the receiving node – either the client or server.
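The Time Plot comparison described above can be approximated in a few lines once the two series have been exported. This sketch assumes a hypothetical list of `(time, advertised_window, cumulative_payload)` samples for a single TCP connection; it does not reflect any Transaction Trace export format.

```python
MSS = 1460   # windows smaller than this are treated as stalls

def window_stalls(samples, mss=MSS):
    """Return (start, end, bytes_per_second) for intervals with a small or zero window.

    samples: time-ordered (time, advertised_window, cumulative_payload) tuples.
    """
    stalls = []
    for (t0, win0, sent0), (t1, _win1, sent1) in zip(samples, samples[1:]):
        if win0 < mss:
            rate = (sent1 - sent0) / (t1 - t0)   # sender throughput during the stall
            stalls.append((t0, t1, rate))
    return stalls

# Hypothetical samples: the window closes at 0.2 s and reopens at 0.7 s.
samples = [
    (0.0, 65535, 0),
    (0.1, 32768, 100_000),
    (0.2, 0, 165_535),
    (0.7, 65535, 165_535),
    (0.8, 65535, 231_070),
]
stalls = window_stalls(samples)
```

Here the sender transfers nothing for the half second during which the advertised window is 0, which is the throughput impact the plot is meant to reveal.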
Processing Delay Type Summary
|Processing Type|Flow Relationship|Associated with|
|---|---|---|
|Server processing|Between flows|Server|
|Starved for Data Processing|Within a flow|Sender|
|Client processing|Between flows|Client|
|Receiver flow control|Within a flow|Receiver|
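The summary table can also be read as a small decision rule: where the delay occurs, and which kind of packet ends it, together identify the processing type. The sketch below encodes that rule as a lookup; the string labels are hypothetical and chosen only for this example.

```python
# (where the delay occurs, packet that ends the delay) -> processing delay type
DELAY_TYPES = {
    ("between_flows", "reply_data"): "Server processing",
    ("between_flows", "request_data"): "Client processing",
    ("within_flow", "sender_data"): "Starved for data",
    ("within_flow", "receiver_window_update"): "Receiver flow control",
}

def classify_delay(position, ended_by):
    """Map an observed delay to one of the four processing delay types."""
    return DELAY_TYPES.get((position, ended_by), "Unclassified")

# A pause inside a flow that ends when the sender resumes by itself.
kind = classify_delay("within_flow", "sender_data")
```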
Corrective actions for server-side processing delays will be dictated by application and server monitoring solutions, of which Dynatrace is of course the best example. These actions may include application changes, server tuning, or increased server performance, and may depend on delays in backend tiers. It is quite helpful to provide as much application-level detail as possible when collaborating with the application team; Transaction Trace’s Thread Analysis provides an excellent handoff.
Client-side corrective actions mirror those suggested for server processing bottlenecks; a performance analysis monitor or tool set can provide visibility into the logical and physical performance constraints on the client machine. For browser-based clients, tools like dynaTrace Ajax Edition, Google PageSpeed and YSlow (all free) provide helpful performance tuning insight.
You should treat receiver flow control delays (“Window 0 events”) the same as other processing delays, by examining code execution bottlenecks on the receiving node. This behavior does not indicate a network or TCP configuration problem. Increasing the receive window size may allow the sender to transfer the data more quickly, but it will not improve the response time; instead of sitting in the sender’s transmit queue, the data will simply sit in the receive buffer.
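A toy simulation illustrates why a larger receive window does not help when the receiving application is the bottleneck. It is a sketch under simplifying assumptions (discrete time ticks, no latency or loss), and all parameter names and values are hypothetical.

```python
def ticks_to_consume(total, buffer_size, link_per_tick, read_per_tick):
    """Ticks until the receiving application has read `total` bytes."""
    sent = read = ticks = 0
    while read < total:
        ticks += 1
        window = buffer_size - (sent - read)           # free receive buffer space
        sent += min(link_per_tick, window, total - sent)
        read += min(read_per_tick, sent - read)        # the slow application read
    return ticks

# The link can carry 100 bytes per tick, but the application reads only 10.
small_buffer = ticks_to_consume(1000, buffer_size=50, link_per_tick=100, read_per_tick=10)
large_buffer = ticks_to_consume(1000, buffer_size=500, link_per_tick=100, read_per_tick=10)
```

Both runs take the same number of ticks: the larger buffer lets the sender finish transmitting earlier, but the application still consumes the data at its own pace.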
How do you monitor, report and manage client and server delays in your network?
In Part VI, we will look at the dreaded Nagle algorithm. Stay tuned and feel free to comment below.