APM (Application Performance Management) drama in three acts.


Dramatis personæ:

Bart – somewhat confused Operations Manager

Mike – Network Administrator, and a good man in a good place

Act 1 – The Pain is Shared

The scene takes place in the corporate building. A small alcove in the corridor reveals a kitchen unit with a coffee machine, a fridge and a sink. It’s Monday morning.
Bart (somewhat confused Operations Manager) is trying to make the coffee machine work, with no success. He is pressing buttons furiously.

Bart (speaking to the coffee machine): Why don’t you work? I need my coffee! I cannot think without my coffee.

Mike (emerging from the corridor, approaching the kitchen unit): Hi there, mate. How’s life? How was your weekend?

Bart (still pushing the buttons): It’s all good. How are you?

Mike: Doing pretty well, thanks. It seems you are bothered with something, though.

Bart (straightening up): Well, yes. In fact I am. Every day in the morning I drink my cup of coffee. It keeps me going through the day. But today I cannot force this contraption to make a simple cup of coffee for me.

Mike (opening a fridge): That sounds serious, so what are you going to do?

Bart (scratching his head): I do not know. The day has started off badly and it is going to get even worse. In the afternoon I have a status briefing on our key business applications and on how my team is helping to maintain their availability and responsiveness. And well, … I do not have good news. Without my coffee I will not be able to report on the status properly and the VP will tear me to pieces.

Mike: Maybe I can help somehow? Did you try to fix the machine already?

Bart: Yes, I poured the water in, added coffee beans; I even made sure the filter is clean. I flipped the switch many times but it still does not work.

Mike: That sounds like you have checked all the main suspects. I think I know what the answer is, though. Let me show you something. Leave the cup here and follow me.

 (they walk away; Mike is leading Bart to his office)

Act 2 – A Visit to the Parallel World of Application Performance Management

The scene takes place in Mike’s office. The walls are covered with framed certificates of excellence, and awards. Mike walks behind his desk and points to a chair in front.

Mike (walking to sit behind his desk): Please, take a seat.

Bart (sitting down): I still do not know why you brought me here. It will not help me have my morning cup of coffee….

Mike (smiling): Quite the contrary. Please read this. (hands over a small book to Bart)

Rudyard Kipling

Bart (putting his reading glasses on): “Poems by Rudyard Kipling”. Are you kidding me?

Mike: Trust me. Read page 7.

Bart (giving up): All right! (scans the book). Page 5…6…7, there it is. “Six Honest Men”. Mike, really?… These are poems for kids… Well, all right… (reads from the book) “I keep six honest serving-men, (They taught me all I knew); Their names are What and Why and When And How and Where and Who.”. I have no idea what it means.

Mike: Bart, it is all very easy. Think about your afternoon meeting. You said earlier that you are responsible for the performance of the business applications.

Bart: Yes?…

Mike: Just like a coffee machine, a business application is a complicated construction with a lot of moving parts.

Bart: That is correct. We have a web frontend, load balancers, and a number of Java application servers which communicate with each other using MQ queues. The data itself is stored within a cluster of SQL Servers…. Yes, this is a decent environment.

Mike: And how do you assure correct performance of the application?

Bart: Well, we monitor all the tiers of the application. We have agents on the browsers, web servers, inside of the application server code, as well as a number of host monitoring agents. We can follow the transaction from the browser through all the tiers to the data center. If any of the tiers is underperforming we send a squad of engineers to remove the problem. And because we track all the historical data as well we can extrapolate from the trends and prepare for performance degradation even before it actually hits us. We have all the bases covered.

Mike (smiling): That’s impressive. However…

Bart (frowning): Oh come on, don’t tell me you can improve it just by listening to this short description!

Mike: I think you have forgotten about one of the fundamental building blocks of your business application.

Bart: I am pretty sure I did not. What do you mean?

Mike: Let me give you a hint: what is the common element of all the tiers of your application that allows the communication?

Bart: The network, of course, but this is not what you are asking me about, is it?…

Mike: This is exactly it, Bart. The network spans across all the tiers of your application. It is an underlying platform which assures all the communication. What do you do to monitor the network?

Bart: Well, we monitor the communication between the services of our business applications.

Mike: That is very good. But what if your application is impacted by network resource consumption caused by external sources? You should think of your network as another tier of your business application with its own logic and resources, and that means you need to know who is using it, what is being done with it, how much traffic is being sent between endpoints and locations, where the traffic originates from and where it is going, and ultimately why you observe communication response time degradation. This may be related to network errors, network services being down, link oversaturation, collisions, virus activity and much more.

Bart: Yes… this all makes sense. This reminds me of a situation we had some time ago which was initially reported as a SQL Server performance problem. Our monitoring tools told us that the SQL queries were taking a long time to execute. It took us some time to realize that the problem was not really within the SQL Server and database construction, but was related to network usage: the same SQL Server was also being used by another, non-business-critical application which was consuming 80% of the network and host resources.
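(An aside for the reader: the "who" and "what" questions from Bart's story can be sketched in a few lines of Python. This is only an illustration and not how any monitoring product works internally. The flow records, host names, application names and the traffic_share helper are all hypothetical, and the byte counts are made up so that one application ends up with an 80% share.)

```python
from collections import defaultdict

# Hypothetical flow records: (client, server, application, bytes_sent).
# All names and numbers are invented for illustration only.
FLOWS = [
    ("10.0.1.15", "sql01", "order-entry", 2_000_000),
    ("10.0.1.22", "sql01", "order-entry", 1_000_000),
    ("10.0.7.40", "sql01", "nightly-report", 12_000_000),
]


def traffic_share(flows, key_index):
    """Sum bytes per key; return (key, bytes, share) tuples, largest first."""
    totals = defaultdict(int)
    for flow in flows:
        totals[flow[key_index]] += flow[3]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(key, count, count / grand_total) for key, count in ranked]


# "What" is using the network: group by application name.
by_app = traffic_share(FLOWS, key_index=2)
# "Who" is using the network: group by client address.
by_client = traffic_share(FLOWS, key_index=0)

for name, count, share in by_app:
    print(f"{name}: {count} bytes ({share:.0%} of traffic)")
```

(Grouping by a different field, such as the server or an office location, would answer the "where" question in the same way.)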

Mike (showing the screens on his monitor): Let me show you something. This is a set of Network Performance Monitoring reports I sometimes use in my business unit. They are part of Compuware’s dynaTrace Data Center Real-User Monitoring solution. On these screens you can see the top 10 most active applications. This screen shows the most active servers and I can drill down to all the clients connecting to them. And here you can see the remote offices’ traffic as it hits the datacenter. I see data volumes, TCP errors, response times, listed by network application name, protocol, servers, QoS, locations, and more.

The Network Performance Report gives a good overview of things like network bandwidth consumption per site, servers and end users

Bart: This is really impressive. This broadens the visibility of our tools so much. I can see many more potential uses for it, like network capacity planning, tier reorganization, QoS management and so on. I must inform my team about these reports. The data they provide is going to significantly reduce the downtime in case of network problems and help us plan for network growth in the future. I wish these reports were integrated with the dynaTrace PurePath tools we are using.

Analyzing Network Performance per Office Locations to pinpoint bandwidth, latency or connectivity issues

Mike: They already are. DCRUM provides insight into network performance, but it is also an End User Experience and Application Performance Monitoring tool. Take a look. On this DCRUM report I can see a breakdown of all URLs, monitored 24 hours a day, 7 days a week, for all users. As you can see, some of the operations are reported to be very slow. With a single click I can jump from the DCRUM report directly into dynaTrace Deep Application Transaction Management. The PurePath technology allows me to analyze the root cause of the transaction slowdown at the code level.

Identifying slow requests that impact end user experience. From here we can drill to the full end-to-end view including code-level visibility
The End-to-End transaction view pinpoints the problematic application tier and code

Bart: I am even more impressed. With the combination of agents and network probes I can achieve true depth and breadth of visibility. Thanks, Mike!

Mike: You are welcome. Do you understand why I brought you here now?

Bart: You were very helpful. But I still do not get the part about Kipling’s poem. How did it go?

Mike: Let me quote it to you. I know it by heart: “I keep six honest serving-men, (They taught me all I knew); Their names are What and Why and When And How and Where and Who.”

Bart: I think I understand. The Six Honest Men from the book are the questions we need to ask to understand our network performance. And the answers are in Compuware’s Data Center Real User Monitoring in the Network Performance Monitoring report set.

Mike: Precisely. I am glad I could be of help.

Bart (leaving through the door): Well, I really must go. It is going to be a busy day. See you later!

Act 3 – The Solution is Found

Mike’s office. Bart left the room five seconds ago. Mike is still sitting behind his desk, looking at the door with a smile on his face. The door opens and we see Bart’s head poking in again.

Bart: You still did not help me with the coffee machine problem, you know!

Mike: Yes I did. Think about it. We spoke of the fundamental building block of all business applications, Bart. Remember what I said? Your coffee machine is like a business application.

Bart: Yes?…

Mike: Your coffee machine also has a fundamental building block shared between its core components. In the case of the business application this was the network, and here I am speaking about electricity. What you need to do, Bart, is…

Mike and Bart speak together in one voice: Plug in the power cable!

..oOo..