WAN optimization can only be as efficient as you let it be. I recently saw a good example of how, without diligent application performance management, a multi-million WAN optimization investment can quickly end up as a lukewarm “implementation completed” outcome.

The challenge at one of our customers started in a typical way. A large company with operations around the globe, supported by a large SAP ERP and CRM landscape, was constantly hit by complaints about the slowness of its vital SAP systems, and the complaints grew to a level that concerned the management team. The IT team was tasked with addressing the issues, given an adequate budget and a very simple goal: end users should no longer complain about SAP slowdowns affecting their productivity.

Investments went into the most prominent areas: speeding up the worldwide network and bringing the whole application delivery chain under APM control. Dynatrace became a partner in the project and delivered complete end-to-end performance monitoring of all SAP applications, with comprehensive visibility into transactions and users identified by name, for all global locations. In fact, all applications, including SAP, web and non-web apps, are now monitored for performance, usage and availability thanks to the flexibility and scalability of Dynatrace DC RUM.

DC RUM quickly proved that the SAP systems were scaled properly and behaved predictably, with no major slowdowns incurred within the data center part of the application delivery chain.

The network team partnered with their WAN provider and implemented WAN optimization controllers (WOCs) from Cisco: Cisco WAAS devices were installed at all vital locations where the centrally hosted SAP applications are delivered. The project was completed, with all WOCs operational and configured to optimize all network traffic.

However, the end users didn’t stop complaining about the application performance. For them hardly anything changed. Now what?

The optimized problem

A key ingredient of the Dynatrace APM offering is the Guardian service, which helps our customers implement APM best practices. The on-site Guardians rose to the challenge and took a deep dive into the efficiency of the whole application delivery chain, including the WAN optimization technology. Dynatrace DC RUM measures network traffic on both the LAN and WAN sides of the WAN optimization controllers and reconstructs the exact flow of each transaction through the optimized network and the data center network. DC RUM does this for every application on the network and for every user, in both Cisco WAAS and Riverbed Steelhead environments.

Diagram 1: DC RUM’s network probe monitors both sides of the data center WOC

DC RUM understands the specifics of the WAN optimization controllers’ protocols on the WAN side and thus precisely measures how each individual application transaction is optimized and delivered to the remote client. With such deep knowledge of the data flow at the application protocol level, DC RUM quickly identified some unexpected effects.

Sure, the WAN optimization and application acceleration technologies helped the typical target apps: the customer’s SMB and HTTP traffic was optimized well, with compression reducing the traffic volume on the WAN side and leaving more bandwidth for other critical apps like SAP GUI and SAP portal apps delivered over HTTPS.

However, negative network traffic reduction levels were observed for the most important apps. In other words, these apps generate more traffic on the WAN side than on the LAN side in the data center. This is not a desired effect of WAN optimization, especially for apps that are the primary target of the performance improvement effort.

Negative traffic compression ratio observed on the WAN link indicates inefficient WOC operation for the application of focus
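
To make "negative traffic reduction" concrete, here is a minimal sketch of how such a ratio could be computed from byte counts taken on both sides of the WOC. The function name and the sample numbers are assumptions made for illustration; they are not DC RUM's metric definition or the customer's data.

```python
def traffic_reduction(lan_bytes: int, wan_bytes: int) -> float:
    """Return the fraction of LAN-side bytes saved on the WAN side.

    Positive: the WOC shrank the traffic. Negative: the WAN carries
    more bytes than the LAN, so optimization added overhead.
    """
    if lan_bytes <= 0:
        raise ValueError("lan_bytes must be positive")
    return (lan_bytes - wan_bytes) / lan_bytes


# Illustrative byte counts only (LAN side, WAN side).
samples = {
    "file sharing": (1_000_000, 400_000),
    "HTTPS portal": (1_000_000, 1_050_000),
}
for app, (lan, wan) in samples.items():
    print(f"{app}: {traffic_reduction(lan, wan):+.1%}")
# file sharing: +60.0%
# HTTPS portal: -5.0%
```

A negative value for an application is the signal to look at: the WOC is spending effort on that traffic and making it bigger.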

DC RUM uncovered a more significant issue that can be observed at the TCP traffic flow level. A high number of Client Zero Window Size signals are sent from the remote locations, indicating that the remote WAAS devices are overloaded and can’t process all the traffic they receive in time.

Remote WOC TCP receiver flow control limits throughput
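
For readers less familiar with TCP flow control: a receiver that cannot drain its buffer advertises a receive window of zero, and the sender must pause. The sketch below counts such advertisements per host from already-decoded segments. `TcpSegment` and its fields are hypothetical stand-ins for whatever a packet decoder provides; this is not how DC RUM itself is implemented.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class TcpSegment(NamedTuple):
    """Hypothetical, already-decoded TCP header fields."""
    src: str              # host that sent the segment and advertises the window
    window: int           # advertised receive window in bytes
    is_rst: bool = False  # True for RST segments


def zero_window_counts(segments: Iterable[TcpSegment]) -> Counter:
    """Count zero-window advertisements per sending host."""
    counts: Counter = Counter()
    for seg in segments:
        # RST segments often carry a zero window but are not flow control.
        if seg.window == 0 and not seg.is_rst:
            counts[seg.src] += 1
    return counts


# Illustrative trace: the remote WOC stalls twice.
trace = [
    TcpSegment("remote-woc", 65535),
    TcpSegment("remote-woc", 0),
    TcpSegment("remote-woc", 0),
    TcpSegment("dc-woc", 0, is_rst=True),
]
print(zero_window_counts(trace))
# Counter({'remote-woc': 2})
```

A steadily growing count for one remote device points at a receiver that cannot keep up, which is what was seen here.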

Why is that? A look at the DC RUM’s WAN optimization efficiency report delivers a clear answer:

Observed compression rates indicate the need for more granular policy configuration

  • Compression of the SAP traffic helps very little – because SAP GUI traffic is by design already compressed.
  • Compression of the HTTPS traffic does not help at all – it actually has a negative bandwidth effect.

Most importantly, compressing already compressed traffic (SAP GUI and HTTPS) consumes remote WAAS resources, leaving no capacity for the other WAN optimization services of the WAAS. Namely, the Transport Flow Optimization (TFO) buffers on the remote WAAS cannot be emptied in time because the WAAS CPU is busy decompressing and compressing traffic that is already compressed. This forces TCP flow control to send Client Zero Window Size events to the peer WAAS, limiting throughput and reducing performance.
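
The effect of compressing data that has no redundancy left is easy to reproduce with a general-purpose compressor. The sketch below uses Python's `zlib` as a stand-in; WAAS uses its own compression and redundancy elimination, so treat this as an illustration of the principle only. Random bytes serve as an assumed proxy for encrypted or already well-compressed payload.

```python
import os
import zlib


def reduction_after_compress(payload: bytes) -> float:
    """Fraction of bytes saved by one zlib pass (negative means growth)."""
    compressed = zlib.compress(payload)
    return (len(payload) - len(compressed)) / len(payload)


# Repetitive plain-text payload: compresses very well.
text_like = b"GET /portal/index.html HTTP/1.1\r\nHost: example.invalid\r\n\r\n" * 500

# Random bytes as a proxy for encrypted (HTTPS) or already compressed payload.
opaque = os.urandom(len(text_like))

print(f"text-like payload: {reduction_after_compress(text_like):+.2%}")
print(f"opaque payload:    {reduction_after_compress(opaque):+.2%}")
```

The text-like payload shrinks dramatically, while the opaque payload comes out slightly larger than it went in because the compressor adds framing around data it cannot reduce. The CPU time is spent in both cases.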

The net effect? All WAN traffic is slower now than before the WAAS implementation!

Optimizing the WAN optimization

The remedy, once the data is visible, is simple: disabling compression for specific applications frees WAAS CPU cycles from compressing what’s already compressed. That lets the WAAS TFO thread finish its work in time, which prevents receive buffer overflows, which prevents TCP Client Zero Window Size events, which speeds up WAN transmission for every app.
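
As a rough sketch of how measured reduction levels could drive that decision, the snippet below lists the applications whose compression gain falls under a threshold. The function name, the 5% threshold and the figures are assumptions made for illustration; the actual policy change is made in the WOC configuration and is not shown here.

```python
from typing import Dict, List


def apps_to_exclude_from_compression(
    reduction_by_app: Dict[str, float], min_gain: float = 0.05
) -> List[str]:
    """Return apps whose measured traffic reduction is below min_gain."""
    return sorted(app for app, gain in reduction_by_app.items() if gain < min_gain)


# Illustrative reduction levels, not the customer's measurements.
measured = {"SMB": 0.55, "HTTP": 0.40, "SAP GUI": 0.02, "HTTPS": -0.04}
print(apps_to_exclude_from_compression(measured))
# ['HTTPS', 'SAP GUI']
```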

Lessons learned

  • Throwing in WAN optimization technology doesn’t solve the network performance problem by itself. WAN optimization needs to be tuned to the application mix on the network links, and its effects measured in two categories: bandwidth optimization and response time improvements experienced by the end users.
  • Measuring both requires app-aware performance management tools. Network link utilization measurements don’t reflect what the network carries for whom. Wire data insight doesn’t tell you what the end users experience. Only application flow analytics uncovers the true app-network interaction, and this requires a transactional understanding of the application traffic.
  • Leverage APM specialists. No one can be an expert in everything. APM is a team game, so teaming up with specialists helps achieve the desired results faster. In this case the sharp eye of the Dynatrace Guardian spotted the WAN optimization inefficiency and triggered the corrective action based on objective end user experience measurements.

With the right APM tooling, one that decodes the applications’ network protocols, you can understand how applications interact with the network. With this knowledge you can tune the WAN optimization techniques to the application specifics and thus optimize the WAN optimization for the desired effects: improving the end user experience with the applications and achieving cost savings on WAN bandwidth. You may also find this blog post useful for its advice on WAN optimization approaches.