Effective communication between people has always been, and always will be, a challenge. There is always a chance of misunderstandings, or of talking in circles until you get the answers you expect.
With more AI (Artificial Intelligence) entering our lives (both in the personal and in the enterprise space) we need to make sure that we are not repeating the same issues. An AI (Voice, Chat, Augmented Reality …) should really make our lives easier by:
- Notifying and advising us, e.g.: Take an alternative route due to a bad traffic jam!
- Automating tasks, e.g.: Please order a taxi so I arrive at my friend’s place on time!
When Dynatrace came up with Davis® – our deterministic AI – we didn’t just want it to be better in anomaly detection, automated root cause or business impact analysis. We wanted Davis to be an AI that supports the lives of our users in similar ways. To give you just two examples:
- Proactively alerting on problems before your users start complaining
- Giving instant answers to business & technical questions without having to ask your colleagues
While our users can already interact with Dynatrace Davis through our Voice and Chat Assistant, the Web interface or the Automation API, we keep getting feedback from our users with ideas on how to improve – let’s say the “soft communication skills” – of Davis when it communicates with us humans!
Today’s example comes from a Dynatrace customer at a top US-based insurance company. As with the previous blogs I wrote based on his input, he started the discussion with an email saying: “At my company, we are using Dynatrace Davis for full-stack monitoring. We recently rolled out the Dynatrace Mobile App to automate problem notification and to give more transparency to our management as they want to know what is going on and who is working on which problems. The only challenge we have is that sometimes Davis gives us too generic information about an issue, and it takes additional communication cycles to get all the information we need to work on the problem. We started “teaching” (through configuration) Davis to be more specific in its findings. This significantly improved the conversation between Davis and us humans. It also increased the adoption of Davis by our management team who are now getting status information directly from Davis vs having to call different team members. Davis clearly makes our lives easier! Andi, let me share some details as this might be useful for others out there as well!”
Problem: The “Too Generic” Interaction
By default, Dynatrace detects process groups & services through a generic algorithm. Dynatrace looks at metadata like technology type (Java, .NET, PHP, …), the platform (Tomcat, Kubernetes, OpenShift, …), container details (image, tags, …) as well as class names, service names, … in order to come up with a meaningful name to display to the user. At this large US-based insurance provider, this approach works for most processes and services – but is too generic for some.
The following is an example of a Dynatrace Davis detected Response Time Problem for the GenerateXML service – impacting 60+ Requests / Min:
While the information on the problem card is great, it leaves some open questions:
- What is GenerateXML?
- Where is this service running?
- Which application server is hosting it?
- Which host is causing this?
The answers to these questions can easily be retrieved by opening the problem details in Dynatrace. But – why force users into additional clicks? Can’t Dynatrace be made “smarter” and give humans all the answers they need right away to work on this? The answer is YES!
Solution: Service & Process Group Naming Rules
Dynatrace allows you to define custom rules for process group and service naming. When onboarding new services into their infrastructure, the customer and his team “teach” Davis how to come up with more meaningful names, which results in much clearer initial communication. The following screenshot shows a problem ticket similar to the one above – but now with all the context information a human needs to immediately react to it:
The new naming not only benefits problem analysis in the Dynatrace UI, it also helps when integrating Dynatrace into the Incident Notification and Resolution Workflow. Dynatrace provides several problem notification integrations with tools such as ServiceNow, xMatters, PagerDuty, VictorOps, OpsGenie, JIRA, Slack, custom webhooks or others.
At this large US-based insurance company, the team decided to use the custom webhook integration to push problems into their Help Desk Platform. The team also started rolling out the Dynatrace Mobile App for both their On-Call teams and their managers. Here is why and how it leads to more transparency!
Benefit: Transparency to Management through Dynatrace Mobile App
Management is always interested in status. At this company, it is no different. Prior to Dynatrace, the Management team typically picked up the phone to call the On-Call team and ask for the status of currently open issues. This was extra overhead for the On-Call team, who were busy fixing issues. The customer removed this overhead and gave management automated transparency by equipping On-Call teams as well as Management with the Dynatrace Mobile App.
When Dynatrace Davis detects a problem, it first creates a ticket in their Help Desk Platform through the Custom Webhook Integration. Additionally, the On-Call staff as well as Management will receive a notification on their mobile phone as shown on the following screen:
The management team can immediately get more information about the impact of the current problem without having to reach out to the on-call team. The following screenshot shows a sample:
The On-Call team can immediately start drilling into the problem details. They can also use the comment feature to keep everyone up-to-date on actions taken and current status:
Commenting on problem tickets is a great feature. It can be done through the mobile app, the web interface and also through the Dynatrace Problem API. The API allows you to work on problems in your own tools, e.g.: Help Desk Management Tools, JIRA, ServiceNow, … and push any actions taken as a comment to the Dynatrace Problem. The API also opens the door to pushing information from auto-remediation workflows to Dynatrace.
As the following screenshot shows, all relevant information is stored on the problem and accessible by everyone. This eliminates any “What’s the status on this problem?” phone calls or Slack messages:
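To make the idea of pushing a comment from your own tooling more concrete, here is a minimal Python sketch. It only builds the HTTP request and does not send it. The endpoint path (`/api/v2/problems/{problemId}/comments`), the `message` and `context` body fields and the `Api-Token` authorization header are assumptions based on the Problems API v2, so please verify them against the API documentation of your own environment. The function name `build_comment_request` and all values are hypothetical.

```python
import json
import urllib.request


def build_comment_request(base_url: str, api_token: str, problem_id: str,
                          message: str, context: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that adds a comment to a problem."""
    # Assumed endpoint shape; check the Problems API docs for your environment.
    url = f"{base_url.rstrip('/')}/api/v2/problems/{problem_id}/comments"
    # Assumed body fields: the comment text and a free-form origin label.
    body = json.dumps({"message": message, "context": context}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Api-Token {api_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
```

A help desk tool or auto-remediation script could pass the returned request to `urllib.request.urlopen(...)` whenever a ticket changes state, so the action shows up on the problem for everyone.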
Pro Tip #1: How to Define Custom Naming Rules
The customer was kind enough to share more technical details on the custom naming rules that they are using. All these configuration settings are well documented and can also be configured through the Dynatrace REST API.
Here is where you find the configuration settings in the UI:
The following shows how the customer is configuring the Process Group Naming Rules for IBM WebSphere. The process group name will contain the Application Server Type Name (WebSphere), the cell name and the VMware host name:
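As a rough illustration of what such a rule does conceptually, the Python sketch below composes a name from detected metadata using a template. This is not the customer's actual rule and not Dynatrace's rule syntax. The placeholder names (`server_type`, `cell_name`, `host_name`) and the sample values are made up for illustration.

```python
def render_name(template: str, metadata: dict[str, str]) -> str:
    """Fill {placeholders} in a naming template from detected metadata."""
    return template.format_map(metadata)


# Hypothetical metadata keys and values, for illustration only.
detected = {
    "server_type": "WebSphere",
    "cell_name": "Cell01",
    "host_name": "vmhost-17",
}
template = "{server_type} {cell_name} {host_name}"

print(render_name(template, detected))  # prints "WebSphere Cell01 vmhost-17"
```

The point is that the name carries the answers to "what is it?" and "where does it run?", so a problem notification is actionable without extra clicks.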
If you want to dive deeper – check out these resources:
- Blog: Custom Process Group Naming for large environments
- Doc: Customize process group names
- Doc: Define custom services
- Doc: Service detection and naming
Pro Tip #2: Setting up Problem Notifications
Make sure to set up the available problem notifications. The list is long (Slack, ServiceNow, JIRA, xMatters, PagerDuty, VictorOps, OpsGenie, …) but also allows you to push notifications to your own endpoints. Here is one of my blogs that explains how to build your own custom Dynatrace problem notification handler.
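If you go the custom endpoint route, a receiver can be very small. The following Python sketch uses only the standard library and assumes that the webhook's custom payload is configured as a JSON object with the keys `State`, `ProblemTitle`, `ProblemID` and `ProblemURL`. Those key names, the ticket fields and the port are assumptions for illustration. Your payload template and help desk API will differ.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def to_ticket(payload: dict) -> dict:
    """Map a problem notification payload to a (hypothetical) help desk ticket."""
    state = payload.get("State", "UNKNOWN")
    title = payload.get("ProblemTitle", "Untitled problem")
    return {
        "title": f"[{state}] {title}",
        "external_id": payload.get("ProblemID", ""),
        "link": payload.get("ProblemURL", ""),
    }


class ProblemHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            # Reject anything that is not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        ticket = to_ticket(payload)
        # Placeholder: replace with a call to your help desk platform.
        print("Would create ticket:", ticket)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("", port), ProblemHandler).serve_forever()
```

Calling `serve()` starts the listener. Keeping the mapping in `to_ticket` separate from the HTTP handling makes it easy to test the ticket format without running a server.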
Pro Tip #3: Emojis are supported
I’ll just leave you with this screenshot for this tip 😊!
Conclusion: Humans can benefit from AI
Thanks to the customer (a human) for showing us how we can leverage Dynatrace Davis to make our lives easier. Great to see how companies like this one are leveraging this new technology, optimizing it to their needs and, with that, providing a better working environment for their employees!