In Part V, we discussed processing delays caused by “slow” client and server nodes. In Part VI, we’ll discuss the Nagle algorithm, a behavior that can have a devastating impact on performance and, in many ways, appear to be a processing delay.
Common TCP ACK Timing
Beyond being important for (reasonably) accurate packet flow diagrams, understanding “normal” TCP ACK timing can help in the effective diagnosis of certain types of performance problems. These include those introduced by the Nagle algorithm, which we will discuss here, and application windowing, to be discussed in Part VII.
A slightly simplified but useful rule of thumb is as follows:
A receiving node will:
- acknowledge every second packet immediately
- acknowledge a single packet if the Delayed ACK timer expires before a second packet arrives
The Delayed ACK timer typically defaults to 200 milliseconds, at least on Microsoft platforms.
While this behavior is quite common, you may encounter differences. For example, some systems are configured to acknowledge every packet as a way of circumventing certain problems, and shorter Delayed ACK timers may be observed in non-Microsoft environments.
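The rule of thumb above can be sketched as a small simulation. This is a simplified model for reasoning about packet flow diagrams, not a description of how any particular TCP stack is implemented. The function name `ack_times` and the use of integer milliseconds are assumptions made for illustration, and the 200 ms default is simply the typical value mentioned above.

```python
DELAYED_ACK_MS = 200  # assumed default, per the rule of thumb above


def ack_times(arrivals_ms, delayed_ack_ms=DELAYED_ACK_MS):
    """Return the times (in ms) at which a receiver would send ACKs.

    arrivals_ms: sorted packet arrival times, in integer milliseconds.
    Simplified model: ACK every second packet immediately; ACK a lone
    packet when the Delayed ACK timer expires.
    """
    acks = []
    pending = None  # arrival time of a packet still waiting for its ACK
    for t in arrivals_ms:
        if pending is not None and t >= pending + delayed_ack_ms:
            # The timer expired before this packet arrived.
            acks.append(pending + delayed_ack_ms)
            pending = None
        if pending is None:
            pending = t      # first packet of a pair: start the timer
        else:
            acks.append(t)   # second packet of a pair: ACK immediately
            pending = None
    if pending is not None:
        # A final lone packet is acknowledged when the timer expires.
        acks.append(pending + delayed_ack_ms)
    return acks
```

For three packets arriving 1 ms apart, `ack_times([0, 1, 2])` acknowledges the second packet on arrival and the third only after the timer expires.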
The Nagle Algorithm
The Nagle algorithm was designed to help reduce network overhead by delaying the transmission of a small packet (i.e., one smaller than the maximum segment size, or MSS) until all previously transmitted packets have been acknowledged. The goal was to prevent a node from transmitting many small packets when the application delivers data to the socket slowly. The usefulness of the algorithm and the frequency with which it is applied have both diminished dramatically; however, you may still encounter it, as the option still exists in most environments.
To begin to understand the impact of the Nagle algorithm, consider a request or reply flow; chances are, the size of the payload in the flow is not an exact multiple of the MSS. Therefore, the last packet of the flow will be smaller than the MSS. With Nagle enabled, this last packet will not be transmitted until the previous (penultimate) packet has been acknowledged. In the best case, the penultimate packet represents an even-numbered packet in the flow, triggering an immediate acknowledgement from the receiver which in turn “releases” the final small packet. In this case, the Nagle penalty is equal to one network round-trip for the entire flow. Should the penultimate packet be an odd-numbered packet, it will not be acknowledged by the receiver until the Delayed ACK timer expires; the penalty becomes one network round-trip plus approximately 200 milliseconds.
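The best-case and worst-case penalties described above can be worked through with a short calculation. This is a rough sketch under the same simplified ACK model: it assumes packet counting starts fresh with the flow and that no earlier data is outstanding. The function name `nagle_penalty_ms` and the default values (an MSS of 1460 bytes, a 20 ms round-trip) are hypothetical and chosen only for illustration.

```python
import math


def nagle_penalty_ms(payload_bytes, mss=1460, rtt_ms=20, delayed_ack_ms=200):
    """Estimate the extra delay Nagle adds to one flow, in milliseconds."""
    packets = math.ceil(payload_bytes / mss)
    if packets < 2 or payload_bytes % mss == 0:
        # Either there is no earlier packet to wait on, or the last
        # packet is a full MSS and is not held back by Nagle.
        return 0
    penultimate = packets - 1
    if penultimate % 2 == 0:
        # Even-numbered penultimate packet: acknowledged immediately,
        # so the final small packet waits one round-trip.
        return rtt_ms
    # Odd-numbered penultimate packet: the receiver waits for the
    # Delayed ACK timer before acknowledging.
    return rtt_ms + delayed_ack_ms
```

With these assumed defaults, a 4000-byte flow (three packets) pays roughly one round-trip, while a 2000-byte flow (two packets) pays the round-trip plus the Delayed ACK timer.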
For larger flows, such as a file download, adding one network round-trip, or a round-trip plus a Delayed ACK timer delay, is likely insignificant. But for operations characterized by smaller flows, adding these delays to each flow can have a dramatic impact. Consider a database connection (where Nagle might be more common): query responses that are delivered in an even number of packets have an odd-numbered penultimate packet, so each will incur a Delayed ACK penalty. SQL statements that actually complete in less than 1 millisecond may take hundreds of milliseconds.
Fortunately, Nagle is not frequently enabled; more recent sightings include a database connection and a financial application. If you suspect Nagle (or just want to check), look for the TCP_NODELAY configuration parameter for the environment; setting TCP_NODELAY disables the Nagle algorithm. An application can also ensure that Nagle is not enabled by setting the TCP_NODELAY socket option when the socket connection is opened.
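As a minimal sketch of the application-side approach, the following uses Python's standard `socket` module to set and check TCP_NODELAY. The helper names `disable_nagle` and `nagle_disabled` are hypothetical, and your environment or database driver may expose the same setting through its own configuration parameter instead.

```python
import socket


def disable_nagle(sock):
    """Set TCP_NODELAY so small packets are sent without waiting for ACKs."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


def nagle_disabled(sock):
    """Return True if TCP_NODELAY is set on this socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


# Example usage (the host and port are placeholders, not real endpoints):
#   sock = socket.create_connection(("db.example.internal", 5432))
#   disable_nagle(sock)
```

Checking the option with `getsockopt` on an existing connection is one way to confirm whether Nagle is in play before looking for it in a trace.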
Have you experienced poor performance because of the Nagle algorithm? Feel free to comment below.
In Part VII, we’ll look at TCP window sizes and the Bandwidth Delay Product, which are particularly interesting for high-bandwidth, high-latency networks.