In my role as technology evangelist, I spend a lot of time helping organizations, big and small, make their IT systems better, faster and more resilient to faults in order to support their business operations and objectives. I always find it frustrating to “argue” with our competitors about what the best solution is. I honestly think that many APM tools on the market do a good job – each with advantages and disadvantages in certain use cases. There is no “one size fits all” – there is just a “this tool fits best for your APM Maturity Level” (not saying the others wouldn’t do a good job).
A lot of the arguing in the APM space is about the fundamental approach to monitoring application transactions: monitor and capture ALL details vs. monitor and capture relevant details. Along with those come topics like “overhead impact”, “scalability” and “data hoarding vs. smart analytics”.
Ultimately, you want to pick the right tool to solve your problems. As you have multiple tools to choose from, let me highlight some of the use cases that our customers solve. As a technologist and a blogger, what I really care about is that the right technology is applied to the right problem. As such, I feel compelled to share what I have learned working with customers in the trenches. Hopefully, this will help you understand the technology and which real-life problems it can solve, and cut through the propaganda. Let me start with a few use cases today and cover more in follow-up blog posts.
Use Cases from Jan, a Performance Engineer
The first use cases are picked from Jan – whom I reached out to after I read his question on our Dynatrace Community Forum. His company decided to move from a competitor to our APM solution and I wondered why. In an email, he highlighted that he had some initial success with the tool, and had been able to solve a couple of low hanging problems. When they decided to start taking a strategic Continuous Delivery approach to software delivery, they realized that the current tool had certain shortcomings slowing their attempts to practice DevOps.
They identified the following key problems they need to solve and what they really required from an APM solution in order to get to where they are heading:
- How a user got to a problem, and not just seeing the problem itself
- Every transaction, with all details they need, out-of-the-box
- Web request/response bytes, SQL bind values, exception details for every transaction
- Number of transactions executed per user and tenant used for business and cost reporting
- Capture custom business context data for every transaction
- Business transactions based on “buried” context data as not every detail is in the URL
- Eliminate homegrown tools which are costly to maintain
- Provide application as well as system and infrastructure monitoring
- Integrate with other tools such as JMeter, LoadRunner, Jenkins or HP OpenView
- Eliminate the need to make people look at other tools and data
- Foster collaboration across Architect, Dev, Test & Ops by using same data set
- Data must be shareable with a single click
- Ability to extend to custom frameworks, systems and protocols
- Bring in custom metrics from external tools via Java Plugin infrastructure
- Follow transactions across any custom protocol or technologies outside Java & .NET
- Full Automation to support Continuous Delivery
- Use Metrics provided by APM for every build artifact along the deployment pipeline to act as quality gateway
- Inform APM about new deployments to prevent false alerting
- Replace traditional application logging
- Eliminate log files, which saves I/O and storage
- Get the log messages captured in context of a transaction and the context of the user that triggered that log message
- One solution for everything
- Not just performance monitoring but also business reporting as well as deep dive diagnostics
- Active community forum
- Get answers right away
- Leverage extensions already provided by the community such as plugins for Jenkins, PagerDuty, …
Let me give you some examples for Jan’s use cases so that you can better decide whether they are relevant for you as well:
Every Transaction with All Details
Dynatrace was built from the ground up to support the full software lifecycle. We understood that we needed a technology that captures every transaction with all details, for root cause diagnostics as well as proper business monitoring, without falling back to a sampling mode where you lose critical information for both. Most of our customers report little to acceptable overhead in production while capturing 100% of transactions, including method arguments, SQL statements, log messages and exceptions. The magic words in our case are PurePath (see the YouTube video) and PureStack, the technologies that allow Dynatrace to do exactly that. One of several visualizations of the PurePath is the Transaction Flow, which is a great way to understand how your transactions flow through the system: where your hotspots are (3rd party impact, custom code issues or impact of Garbage Collection) and where your architectural issues are (e.g. too many web service calls, too many SQL executions).
What if you don’t capture all transactions but instead are “smart” and focus on capturing only the problematic ones? This approach lets you find and fix the easy problems, those that show up in transactions that fail or violate a baseline built on average response time. It falls short when the problem is caused by transactions that are not “outside the norm”. One example is a database deadlock we recently analyzed for a customer. The “smart” approach only highlighted the transaction that hit the deadlock; no information was captured for the transactions that actually caused it with their data manipulations. Solving this problem requires seeing which transactions executed which UPDATE statements in the time leading up to the deadlock.
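To make the deadlock example concrete, here is a minimal sketch of the kind of question you can only answer when every transaction is captured. The record layout (`CapturedStatement`), the function name and the sample data are assumptions made up for illustration; they are not a Dynatrace data format or API.

```python
from dataclasses import dataclass


@dataclass
class CapturedStatement:
    # Hypothetical record of one SQL statement captured inside a transaction.
    transaction_id: str
    timestamp: float  # seconds on some common clock
    sql: str


def updates_before(statements, deadlock_time, window_seconds, table):
    """Return UPDATE statements touching `table` in the window before the deadlock."""
    start = deadlock_time - window_seconds
    return [
        s for s in statements
        if start <= s.timestamp <= deadlock_time
        and s.sql.lstrip().upper().startswith("UPDATE")
        and table.upper() in s.sql.upper()
    ]


# Sample data: only tx-1 and tx-3 update the table shortly before the deadlock.
captured = [
    CapturedStatement("tx-1", 100.0, "UPDATE accounts SET balance = ? WHERE id = ?"),
    CapturedStatement("tx-2", 100.4, "SELECT * FROM accounts WHERE id = ?"),
    CapturedStatement("tx-3", 100.9, "UPDATE accounts SET balance = ? WHERE id = ?"),
    CapturedStatement("tx-4", 50.0, "UPDATE accounts SET balance = ? WHERE id = ?"),
]

suspects = updates_before(captured, deadlock_time=101.0, window_seconds=5.0, table="accounts")
print([s.transaction_id for s in suspects])
```

If only the transaction that hit the deadlock had been recorded, `captured` would hold a single entry and the list of suspects would be empty.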
As companies such as Jan’s reach a maturity level where they outgrow “smart” analysis based on average response time, it is important to be able to look at everything and not just the average problem. As a follow-up, read the blog Why Averages Suck and Percentiles are great!
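A small, self-contained illustration of why an average can hide what users experience. The response times are invented sample data, and the nearest-rank percentile used here is just one common definition, chosen for simplicity.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented sample: 90 fast requests and 10 very slow ones (milliseconds).
response_times_ms = [100] * 90 + [3000] * 10

average = sum(response_times_ms) / len(response_times_ms)
median = percentile(response_times_ms, 50)
p95 = percentile(response_times_ms, 95)

print(f"average={average} ms, median={median} ms, p95={p95} ms")
```

The average of 390 ms describes none of the requests: most finish in 100 ms, while one in ten takes 3 seconds, which only the 95th percentile makes visible.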
Capture Custom Business Context
What is Custom Business Context? The actual business function executed such as “Create Claim” or “Transfer Money”, or the name of the user or tenant of your system. Why is this not as easy as it sounds? Because many applications just don’t show the business function as part of the URL or provide the user name in a cookie. A great example was given in a webinar by NJM Insurance (New Jersey Manufacturers Insurance). They were using a 3rd party claim management software which was designed to “hide” everything behind a claimCenter.do URL. In their case they needed Dynatrace to analyze every single transaction and pick a method argument invoked in the business layer of their app to figure out which function in their system was actually executed. On top of that, they also needed to know the user that executed that function, because their quarterly business reports had to show which insurance office and group of employees created how many claims. The following shows business reporting based on the user role where the user role gets captured from a method argument within the business logic of the application:
This was only possible because Dynatrace allows you to selectively capture business context in the context of every single executed transaction. Along the PurePath you will then see things like method arguments, return values, bind values, session variables, HTTP parameters or cookie values. All to be later used for your business reporting or targeted root cause diagnostics. Here is a follow up blog post that explains business transactions in more technical detail.
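The idea of a business transaction defined by “buried” context can be sketched in a few lines. Everything below is illustrative: the dictionary keys (`captured_args`, `action`, `user_role`), the “Approve Claim” function and the sample records are assumptions, not the way Dynatrace stores or exposes captured data.

```python
from collections import Counter


def business_transaction(tx):
    """Name a transaction by a captured method argument, falling back to the URL."""
    return tx.get("captured_args", {}).get("action") or tx["url"]


# Three requests share the same URL; only the captured argument tells them apart.
transactions = [
    {"url": "/claimCenter.do", "user_role": "Adjuster", "captured_args": {"action": "Create Claim"}},
    {"url": "/claimCenter.do", "user_role": "Adjuster", "captured_args": {"action": "Create Claim"}},
    {"url": "/claimCenter.do", "user_role": "Manager", "captured_args": {"action": "Approve Claim"}},
    {"url": "/login", "user_role": "Manager"},
]

# Count executions per business function and user role.
report = Counter((business_transaction(tx), tx["user_role"]) for tx in transactions)
for (name, role), count in sorted(report.items()):
    print(f"{name:15} {role:10} {count}")
```

Grouping by URL alone would have reported three anonymous claimCenter.do requests; grouping by the captured argument yields the per-function, per-role counts that a business report needs.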
Tool Consolidation & Integration
Why use separate tracing tools, CPU profilers, memory diagnostics, UML graphing, and system, application and business monitoring tools if you can have it all in one tool? Dynatrace provides all this and more for developers, architects, testers, operations and business owners in a single solution centered on the core PurePath technology. It also comes with traditional CPU and Memory Leak Diagnostics features (or watch the webinar on YouTube), allowing you to take CPU and Memory Snapshots from any application server you want at any time with a simple button click. A PurePath can not only be viewed in the Tree View that we provide; you can also create a UML Diagram for every transaction, which is often a great conversation starter in code and architectural review meetings.
As PurePath and PureStack in Production are all about end user, application and system monitoring, you can get rid of your traditional system monitoring tools, as Dynatrace does all of that as well, for free! And for business: having all this business context data allows your business owners to evaluate how well your software supports the business goals: which users spend more or less money, where the users come from, where they drop out or which paths they take to convert. It’s classical conversion funnel tracking.
All of this data can either be viewed using the rich Dynatrace Client UI or be integrated into external tools such as Eclipse, Visual Studio, Jenkins, Bamboo, SilkPerformer, LoadRunner, Gomez, Microsoft SCOM, HP OpenView, etc.
Collaboration across DevTOps
One key cornerstone of DevTOps (Dev, Test, Ops) is collaboration. Dynatrace’s answer to this is sharing captured data with a single click. Any data captured (a single transaction, a full load test, data from production, CPU or Memory Snapshots, …) can be exported into a Dynatrace Session File. This file can be attached to a JIRA Ticket, or sent via Skype or email to anybody who needs to look at it, whether it is a colleague or a 3rd party vendor. The Dynatrace Client to view this data is 100% free, which makes adopting that collaboration even easier.
Check out the Hunting an Oracle JDBC Memory Leak blog, where one of our customers found a problem in Oracle’s JDBC implementation that was crashing their 80 JVM cluster. They shared the session file with Oracle, who accepted it as proof that the fault was theirs and provided a fix based on the information seen in the PurePath. The good news is that Oracle, even though they could afford it :-), didn’t need to buy a Dynatrace Client to view the data, because the Client is free of charge for them too.
We live in a world of frameworks, open source platforms and services that support our applications, and their number is constantly growing. No APM vendor can support everything out-of-the-box, but a vendor can provide extension mechanisms to build custom support for what is not yet covered. Dynatrace provides several extension mechanisms: pulling in data from any external data source, exporting data to external systems, and, via our Development Kits, even extending PurePath end-to-end tracing to technologies that are not supported at the moment.
If you are really serious about Continuous Delivery and you have read up on the current literature, you know that automation is a key element in building quality into your processes and products. Every step that needs to be done manually is error prone and slows down the process. Automation for Dynatrace means, on the one side, that we provide REST interfaces to query data from Dynatrace and to manage the Dynatrace configuration. They can also be used to register builds that are currently running through your pipeline, or to let Dynatrace know about a new production deployment you are about to do so that we can adapt the automatic baselining for the time period right after the deployment becomes active. The other automation element is an automated real-time business transaction data feed into your external analytics solutions, whether that is Splunk (check out our Splunk Application) or other tools.
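A quality gate on APM metrics can be as simple as comparing the numbers of a new build against those of the last good one. The sketch below assumes the metrics have already been fetched from your APM tool into plain dictionaries; the metric names, the 10% tolerance and the function name are hypothetical, and no actual Dynatrace REST call is shown.

```python
def evaluate_quality_gate(baseline, candidate, max_regression=0.10):
    """Return a list of violations where a candidate metric exceeds
    the baseline by more than the allowed relative regression."""
    violations = []
    for name, base_value in baseline.items():
        new_value = candidate.get(name)
        if new_value is None:
            violations.append(f"{name}: missing in candidate build")
        elif new_value > base_value * (1 + max_regression):
            violations.append(f"{name}: {base_value} -> {new_value}")
    return violations


# Hypothetical per-request metrics for the last good build and the new build.
baseline_build = {"sql_statements": 4, "response_time_ms": 200, "exceptions": 0}
candidate_build = {"sql_statements": 12, "response_time_ms": 210, "exceptions": 0}

violations = evaluate_quality_gate(baseline_build, candidate_build)
if violations:
    print("Quality gate failed:", violations)
else:
    print("Quality gate passed")
```

In a pipeline, a non-empty list would typically fail the build step. Architectural metrics such as the number of SQL statements per request are useful here because they tend to regress before response time does.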
Replace Traditional Logging
I thought that this was a particularly interesting point that Jan mentioned. They constantly battled with log spamming. With Dynatrace they simply turned logging off because Dynatrace captures the relevant log messages in the context of the PurePath anyway. It also allowed them to optimize logging, as they figured out that developers kept logging the same messages multiple times per transaction and also logged information that was not relevant at all for that transaction. Showing a PurePath to developers was convincing enough for them to change their implementation. This is especially true when you see the overhead associated with traditional logging, as the following screenshot taken from the WebSphere causing 100% logging overhead in Production blog shows:
On the topic of logging, you may also want to check out the following blog post from Martin, who claims that traditional logging is a dead-end road.
With more than 80k we have a very large and active community. The heart of the community is the discussion forum, where we see hundreds of interactions every week between Dynatrace power users, newbies and Dynatrace employees.
The community also provides a platform to exchange and share extensions to Dynatrace such as the recently added PagerDuty Action Plugin, an updated version of our JIRA Plugin or Plugins to monitor Locks and Execution Plans in Oracle. All there at your fingertips.
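Once log messages are tied to the transaction that produced them, spotting the redundant logging Jan’s team found becomes a simple grouping exercise. This sketch assumes log events are available as (transaction id, message) pairs; that layout, the function name and the sample messages are invented for illustration.

```python
from collections import Counter, defaultdict


def duplicate_messages(log_events):
    """Map transaction id -> {message: count} for messages logged more than once."""
    per_transaction = defaultdict(Counter)
    for transaction_id, message in log_events:
        per_transaction[transaction_id][message] += 1
    return {
        transaction_id: {msg: n for msg, n in counts.items() if n > 1}
        for transaction_id, counts in per_transaction.items()
        if any(n > 1 for n in counts.values())
    }


# Invented sample: tx-1 logs the same message three times, tx-2 is clean.
events = [
    ("tx-1", "Loading customer profile"),
    ("tx-1", "Loading customer profile"),
    ("tx-1", "Order validated"),
    ("tx-1", "Loading customer profile"),
    ("tx-2", "Order validated"),
]

print(duplicate_messages(events))
```

A plain log file makes this hard because the messages of concurrent transactions are interleaved; with the transaction as the grouping key, the repeated message stands out immediately.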
Next Step: See for yourself
I will post more use cases from other customers who decided to use our product in upcoming blog posts. If you want to make up your own mind, simply sign up for Dynatrace Personal, where you can evaluate the full product functionality. While downloading the trial, check out the 5 Minute Overview of Dynatrace on YouTube.