Can a customer really be happy if they don’t use your product the way you intended? This is the story of my humbling experience with customers I never expected to be happy.
I’m responsible for Customer Success at ruxit. My job involves education, support, consulting, and sometimes just being in the wrong spot at the right time. It’s all good though, because I get to see the impact of what the rest of the team creates on customers.
ruxit is a comprehensive Application Performance Management (APM) solution. The APM part means that it watches complex software, figures out which components ask for what resources, how long it takes to get responses, and what resources are used. The comprehensive part is that it combines a bunch of functionality you’d normally need to buy as 4 different tools: Real User Monitoring, Application Monitoring, Server and Network Monitoring, and a dashboard product. In spite of all these functions, ruxit installs quickly, and automatically discovers and displays complex infrastructure, like this:
Each one of those circles is a component (Application, Service, Process or Host) in our customer’s rapidly changing environment. By knowing how everything interacts, ruxit shows causal impact on performance. So when it shows a problem, it shows you not only what the root cause is, but what impact the problem had. Like this:
So as I mentioned, I’m responsible for Customer Success. We have a process where we reach out for feedback, not only to strengthen the relationship, but also to incorporate customer ideas into the product. Before I reach out, I browse the customer environment (we’re SaaS based) and try to get a feel for what they use the product for.
BARBRI was one of the first customers to choose ruxit. They had been a customer for the past 4 months, and were great to work with. We shared their vision of a single APM solution, and with ruxit they took a new approach to APM.
My dilemma was that when I went through their environment, I could not see anything “cool”. We had all sorts of nifty widgets and end user response measurements, slick instrumentation and dashboards set up. And none of it “spoke to me”. It all worked well, but I procrastinated making the phone call because I didn’t have a nugget in my back pocket to prove how awesome the product was. I had to get mentally prepared for the dreaded, “I don’t have any problems, but ruxit does not show me any value.”
So one day I put my big boy pants on and called Greg at BARBRI, hoping for the best, and expecting the worst:
Mike: “Hi Greg, it’s Mike from ruxit [yada yada yada], can you describe the value you get from ruxit, or tell us what we can be doing better?” I duck and cover…
Greg: “Thanks for the call Mike. We love it. Ruxit has certainly shown us problems we never could have found on our own.”
Wait, what? Your end users are fine. Your system runs well. You haven’t used our degradation analysis or flux capacitor yet!!
Greg’s answer was a much needed reminder that we’re solving his problems, not delivering functionality. This unified APM tool pointed out configuration errors and hardware issues that led to intermittent problems, the kind that are impossible to reproduce and show up at the wrong time. It wasn’t rocket science. ruxit simply showed what needed to be fixed but had been clouded by complexity.
Example #1 – Networks are used for lots of stuff
There had been lingering issues about file copy times for some virtualized environments. For the most part the server performed well, but file copies took a long time, and were hard to troubleshoot.
Once ruxit was installed, it started flagging a problem for these symptoms immediately. ruxit discovered a High dropped packets rate problem in those environments:
Most of us don’t immediately relate network performance to I/O performance. These dropped packets were impacting NFS mounts in particular, and slowing down file transfers.
This information allows BARBRI to push their virtualization vendor, with quantifiable metrics in hand, for patches or configuration changes to eliminate the issues. In short, ruxit quantified both the root cause and the impact of network performance on user functions.
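To make “quantifiable metrics” concrete, here is a minimal Python sketch of how a dropped packet rate can be derived from two snapshots of an interface’s cumulative counters. This is not how ruxit computes it. The `InterfaceCounters` shape, the sample numbers, and the 1% threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class InterfaceCounters:
    """Cumulative counters for one network interface (hypothetical shape)."""
    packets: int  # total packets seen so far, including dropped ones
    dropped: int  # total packets dropped so far


def dropped_packet_rate(before: InterfaceCounters, after: InterfaceCounters) -> float:
    """Return the fraction of packets dropped between two snapshots."""
    packets = after.packets - before.packets
    dropped = after.dropped - before.dropped
    if packets <= 0:
        return 0.0  # no traffic in the interval, so there is nothing to report
    return dropped / packets


def is_high_drop_rate(rate: float, threshold: float = 0.01) -> bool:
    """Flag the interval when the rate exceeds an assumed 1% threshold."""
    return rate > threshold


# Made-up sample snapshots taken some interval apart.
before = InterfaceCounters(packets=1_000_000, dropped=200)
after = InterfaceCounters(packets=1_050_000, dropped=1_700)

rate = dropped_packet_rate(before, after)
print(f"dropped {rate:.1%} of packets, high: {is_high_drop_rate(rate)}")
```

Working from the difference between two snapshots matters because interface counters are cumulative: a single reading says little, while the change over an interval gives a rate you can put in front of a vendor.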
Example #2 – Physical ESXi hosts had hardware issues
Even though BARBRI makes use of virtualization, they also host their own hardware. They keep the hardware builds consistent within the cluster.
In one virtualization cluster, ruxit found a hardware issue:
Looking at this issue from the cloud, I didn’t connect the dots. Since Greg had a better grip on the environment, he knew that all his hardware was consistent and that something was not working well. He used the notification above and the data below to home in. Can you see which server stood out from the others in the image below?
Greg described the impact as follows, “Certain file copy functions like rsync were taking longer than expected to finish, but the behavior was not consistent across all servers. The inconsistent behavior made finding the issue difficult, and since it was usually not repeatable it was treated as a temporary network issue. ruxit detected the issue immediately and let BARBRI focus in on resolving the problem.”
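Greg’s reasoning, that identically built hosts should behave alike, can be sketched in a few lines of Python. This is only an illustration of the idea, not ruxit’s detection logic. The host names, the copy times, and the tolerance of 3 are invented, and the median absolute deviation is simply one common way to measure how far a value sits from its peers.

```python
from statistics import median


def find_outlier_hosts(metric_by_host: dict[str, float], tolerance: float = 3.0) -> list[str]:
    """Return hosts whose metric sits far from the median of their peers."""
    values = list(metric_by_host.values())
    center = median(values)
    # Median absolute deviation: a spread estimate that one bad host cannot skew much.
    mad = median(abs(value - center) for value in values)
    if mad == 0:
        # Peers are identical, so any host that differs at all stands out.
        return [host for host, value in metric_by_host.items() if value != center]
    return [
        host
        for host, value in metric_by_host.items()
        if abs(value - center) / mad > tolerance
    ]


# Hypothetical copy times in seconds for hosts with the same hardware build.
copy_seconds = {
    "esx-01": 41.0,
    "esx-02": 39.5,
    "esx-03": 40.2,
    "esx-04": 97.3,
    "esx-05": 40.8,
}

outliers = find_outlier_hosts(copy_seconds)
print(outliers)
```

Comparing each host against the median of its peers, rather than against a fixed limit, only works because the hardware builds are kept consistent within the cluster.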
The moral of the story
In this case, I’d been telling my own story too much, and not listening enough. In my 15 years in APM, I’d pigeonholed myself into thinking of performance as page loads, response times, and service calls. Who knew application-centric products could help with infrastructure impact! Other products I’d serviced, and those we competed with, didn’t give the whole picture, so I redefined the problem to suit my purpose, not my customer’s.
Plus, I remembered how much you can learn by picking up the phone, especially in these days of “digital sales and service”. Greg and the BARBRI team gave great context about their problems, and the impact ruxit had. So yes, even when a customer uses your product in ways you didn’t anticipate, they can still be happy.