Introducing SAP is a major investment, and everyone watches the return on it very closely. Performance problems with SAP delivery can quickly propagate, seriously affect business operations, and lead to tedious war room scenarios. End users will usually point at the SAP team, the SAP team will blame the network, and the IT operations team will fight back. But what if the root cause of the problem is something altogether different?
The inability to quickly determine the root cause of a performance problem is what usually leads to a war room, especially when the money invested in the failing application comes from public funds. Our client, Nimrod, a country-wide government agency in the Republic of Razkavia (names changed for commercial reasons), decided to implement its key applications on SAP infrastructure. When employees in Rosecoast, a harbor city in Razkavia, started to complain about the performance of one of the SAP applications, the operations and SAP teams began to investigate.
Performance Problems at Rosecoast
Hundreds of thousands of Nimrod employees use SAP GUI applications each day. The IT operations team had set up Application Aware Network Performance Monitoring (aaNPM), part of the Dynatrace suite, to monitor the end-to-end performance of SAP applications delivered to Nimrod employees.
One day, however, the head of IT operations and the CIO received complaints from the business department located in Rosecoast that its employees were experiencing performance problems with one of the SAP applications. These problems reduced the department's productivity and were incurring additional operating costs. According to the manager of the business department, all employees at Rosecoast were affected.
The Operations team used the aaNPM tool to check the health of the reported application as experienced at the Rosecoast office. As Figure 1 shows, however, none of the SAP servers was experiencing performance problems: the operation time breakdown shows that server times were not unusually long.
Figure 1. Performance overview across all servers delivering SAP
In the next step the team compared SAP performance across all locations. The network metrics, i.e., RTT and Server Loss Rate, indicated that there were no network problems at any location (see Figure 2). At Rosecoast, however, the aaNPM tool confirmed that total operation times were longer than at other locations.
Figure 2. There are no network problems at Rosecoast, but many Zero Window Size events may indicate issues at the client side
These problems seemed to happen only at Rosecoast, and could be attributed to neither the network nor the application. The many Zero Window Size events, however, hinted at hardware or software performance problems on the client machines: a TCP receiver advertises a zero window when it cannot consume incoming data fast enough. The team used Advanced Diagnostics Server (ADS) from the aaNPM suite to analyze the load sequence of one SAP TCode operation executed by one of the users at Rosecoast. The report (see Figure 3) showed that most of the time needed to complete the operation was spent in the client application, which was busy processing data.
Figure 3. Operation load sequence report for one of the SAP TCodes executed at the Rosecoast location, showing long idle times when the client application is busy processing received data before requesting more input
The big question now was: what had recently changed in the hardware or software of the employee workstations at Rosecoast that could explain these sudden performance problems and the long delays in executing SAP applications?
The Operations team looked for the cause by reviewing the hardware and software change logs. They found that most of the PC workstations used by the business department at Rosecoast had recently been replaced with new ones. On closer examination, the team discovered faulty memory chips in all of the new workstations. The complaints from the business department stopped once all faulty RAM modules had been replaced.
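The elimination reasoning followed so far (servers first, then network, then client) can be sketched in a few lines of Python. This is only an illustration of the logic, not how aaNPM works internally: the `LocationMetrics` fields, the `suspect_domain` function, the 2x threshold, and all numbers are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class LocationMetrics:
    """Per-location averages. Field names are hypothetical, not aaNPM's."""
    operation_time_ms: float
    server_time_ms: float
    rtt_ms: float
    loss_rate_pct: float
    zero_window_events: int


def exceeds(value, reference, factor, floor):
    """True when value is more than `factor` times the reference.

    The reference is never taken below `floor`, so a baseline of zero
    does not turn every small reading into an anomaly.
    """
    return value > factor * max(reference, floor)


def suspect_domain(site, baseline, factor=2.0):
    """Name the first delivery tier that looks abnormal against the baseline."""
    if not exceeds(site.operation_time_ms, baseline.operation_time_ms, factor, 1.0):
        return "healthy"
    if exceeds(site.server_time_ms, baseline.server_time_ms, factor, 1.0):
        return "server"
    if (exceeds(site.rtt_ms, baseline.rtt_ms, factor, 1.0)
            or exceeds(site.loss_rate_pct, baseline.loss_rate_pct, factor, 0.1)):
        return "network"
    # A zero window means the receiver's buffer is full, i.e. the client
    # application is not reading data as fast as the server sends it.
    if exceeds(site.zero_window_events, baseline.zero_window_events, factor, 1):
        return "client"
    return "unknown"


# Made-up numbers: slow operations, normal server and network metrics,
# but far more zero-window events than at other locations.
baseline = LocationMetrics(1500, 400, 20, 0.1, 3)
rosecoast = LocationMetrics(6000, 410, 22, 0.1, 250)
print(suspect_domain(rosecoast, baseline))  # prints "client"
```

The checks run in the same order the team followed, so a site is only flagged as a client-side suspect after the server and network tiers have been ruled out.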
The Operations team at Nimrod used Dynatrace to manage the performance of SAP applications delivered across the organization. Because government policy requires end-to-end SAP GUI communication to be encrypted with the SAP SNC (Secure Network Communications) protocol, the aaNPM network probe was set up to decrypt SNC-protected traffic, which preserved visibility into performance across the complete application delivery chain. The team also enabled synthetic scripts, so that crucial components of the delivered applications were monitored both passively and actively, and set up a number of alerts to proactively detect performance degradation. Although this problem would have been hard to spot without feedback from end users, the team avoided the war room scenario by quickly zeroing in on the root cause of the performance problems at the Rosecoast office. Instead of being stuck with an inconclusive report on the health of the SAP servers, they could show that the root cause lay neither in the network nor in the application, but in faulty hardware.
In the report “Are Users More Aware of SAP Performance Than IT”, Gartner indicates that measuring the end-user experience is key to proactive monitoring and to delivering the expected SAP service levels. Gartner points to Compuware’s Application Aware Network Performance Monitoring (aaNPM) as a comprehensive, SAP-certified solution that can monitor both SAP and non-SAP application components, including the integration tiers between SAP and external applications, as well as communication with the backend SQL database. Compuware APM provides end-to-end visibility of enterprise applications and, because it relies on passive monitoring of network communication, is easier to deploy than agent-based approaches.
(This blog post is based on materials contributed by Krzysztof Ziemianowicz and drawn from an original customer story. Some of the screens shown are customized, but deliver the same value as the out-of-the-box reports.)