My observation from re:Invent 2017 – People should expect more from monitoring tools

I just returned from AWS re:Invent 2017 – wow, what an event!

Several monitoring vendors had impressive exhibits on display, including us (Dynatrace), New Relic, Datadog, SignalFx, and others. I spent some time observing the demos these other vendors were showing to their booth visitors. With each demo that I watched, I repeatedly had the overwhelming urge to grab a microphone and announce…

…but since that behavior would’ve been frowned upon, I thought I would just write this blog post instead.

Over the years, people have grown accustomed to how monitoring tools function and the value they provide. As a result, they unknowingly have low expectations and they accept ineffective capabilities as normal. For example:

Things that people accept as normal with their tools

  • People think that a monitoring tool will require a lot of effort and skill to configure, set up, and use effectively.
  • People accept the fact that they’ll need several different tools to get a basic level of monitoring across all the different systems.
  • People believe that visibility comes from creating a bunch of dashboards and charts.
  • People understand they will have to deal with alert storms and false-positive alerts.
  • People expect that finding the root cause of a problem will require correlating and drilling through tons of data points (and a fair amount of luck).

The monitoring tool demos I observed at AWS re:Invent continued to reinforce these same low expectations.

Demonstrator:  “Here’s how you can display the host CPU % on a chart…”
Audience:      Heads nodding, people clapping.

I was amazed by how many people think these old, tiresome approaches to monitoring are acceptable.

It’s time to realize this isn’t good enough. Stop the madness.

Next time you use a monitoring tool or see a demonstration – expect more. Ask questions like:

  • Why doesn’t this tool monitor the end user experience?
  • Why can’t I see the performance details of each user session and each user click?
  • Why does this tool require 10 different types of agents?
  • Why doesn’t this tool just tell me the root cause of a problem?
  • Why doesn’t this tool show me the business impact of a problem?
  • Why does this tool require me to modify the startup scripts for every single JVM to monitor Java applications? Why isn’t this done automatically?
  • Why does this tool only have 2 new releases per year?
  • Why isn’t this tool usable by other people in the organization, including non-technical people?
  • Why do I need multiple tools for APM, infrastructure, cloud, logs, and user monitoring?
  • If this tool can monitor a Docker container, why doesn’t it automatically monitor the application component within the container?
  • Why doesn’t the tool automatically upgrade itself?
  • Why can’t I choose to run the tool on-premises or use it via SaaS?
  • Why do I have to configure alerting rules? Isn’t the tool smart enough to recognize application and infrastructure problems by itself?
  • Why doesn’t the tool automatically trace all the transactions from end to end? Why do I have to insert additional code into my application to enable tracing and telemetry?
  • Why do I need yet another tool like Moogsoft or BigPanda to handle the alert notifications?
  • Why do I have to manually configure dozens of different “integrations” for the monitoring to work?

Expect more – monitoring redefined

If you expect more and want a monitoring solution that does not have these same old limitations, you should try Dynatrace and find out what monitoring redefined is all about.

