More Effective AI to Human Interactions with Dynatrace DAVIS at NYCM

Effective communication between people has always been, and will always be, a challenge. There is always a chance of misunderstandings, or of talking in circles until you get the answers you need.

With more AI (Artificial Intelligence) entering our lives (both in the personal and in the enterprise space) we need to make sure that we are not repeating the same issues. An AI (Voice, Chat, Augmented Reality …) should really make our lives easier by:

  • Notifying and advising us, e.g.: Take an alternative route due to a bad traffic jam!
  • Automating tasks, e.g.: Please order a taxi so I arrive at my friend’s on time!

When Dynatrace came up with DAVIS – our deterministic AI – we didn’t just want it to be better at anomaly detection, automated root cause analysis or business impact analysis. We wanted DAVIS to be an AI that supports the lives of our users in similar ways. To give you just two examples:

While our users can already interact with Dynatrace DAVIS through our Voice and Chat Assistant, the Web interface or the Automation API, we keep getting feedback from our users with ideas on how to improve – let’s say the “soft communication skills” – of DAVIS when it communicates with us humans!

Today’s example comes from Chad Turner, Dynatrace Certified Associate Network Systems Technician at NYCM. As with the previous blogs I wrote based on his input, he started the discussion with an email saying: “At NYCM we are using Dynatrace DAVIS for full-stack monitoring. We recently rolled out the Dynatrace Mobile App to automate problem notification as well as to give more transparency to our management as they want to know what is going on and who is working on which problems. The only challenge we have is that sometimes DAVIS gives us too generic information about an issue, and it takes additional communication cycles to get all the information we need to work on the problem. We started “teaching” (through configuration) DAVIS to be more specific in its findings. This significantly improved the conversation between DAVIS and us humans. It also increased the adoption of DAVIS by our management team who are now getting status information directly from DAVIS vs having to call different team members. DAVIS clearly makes our lives easier! Andi, let me share some details as this might be useful for others out there as well!”

Problem: The “Too Generic” Interaction

By default, Dynatrace detects process groups & services through a generic algorithm. Dynatrace looks at metadata like technology type (Java, .NET, PHP, …), the platform (Tomcat, Kubernetes, OpenShift, …), container details (image, tags, …) as well as class names, service names, … in order to come up with a meaningful name to display to the user. At NYCM this approach works for most processes and services – but is too generic for some.

The following is an example of a Dynatrace DAVIS detected Response Time Problem for the GenerateXML service – impacting 60+ Requests / Min:

Great detection but too generic service name. A human would ask: What host is running GenerateXML?

While the information on the problem card is great it leaves some open questions:

  • What is GenerateXML?
  • Where is this service running?
  • Which application server is hosting it?
  • Which host is causing this?

The answers to these questions can easily be retrieved by opening the problem details in Dynatrace. But – why force users into additional clicks? Can’t Dynatrace be made “smarter” and give all the answers right away that humans need to work on this? The answer is YES!

Solution: Service & Process Group Naming Rules

Dynatrace allows you to define custom rules for process group and service naming. When onboarding new services into their infrastructure, Chad & team “teach” DAVIS how to come up with more meaningful names, which results in much clearer initial communication. The following screenshot shows a problem ticket similar to the one above – but now with all the context information a human needs to immediately react to it:

Optimized communication: all relevant problem context data now part of the problem ticket

The new naming not only helps when analyzing problems in the Dynatrace UI, it also helps when integrating Dynatrace into the Incident Notification and Resolution Workflow. Dynatrace provides several problem notification integrations with tools such as ServiceNow, xMatters, PagerDuty, VictorOps, OpsGenie, JIRA, Slack, custom webhooks or others.

At NYCM they decided to use the custom webhook integration to push problems into their Help Desk Platform. NYCM also started rolling out the Dynatrace Mobile App for both their On-Call Teams as well as for their managers. Here is why and how it leads to more transparency!

Benefit: Transparency to Management through Dynatrace Mobile App

Management is always interested in status. At NYCM it is no different. Prior to Dynatrace, the Management team typically picked up the phone to call the On-Call team and ask for status on current open issues. This was extra overhead for the On-Call team, who were busy fixing issues. NYCM removed this overhead and gave management automated transparency by equipping On-Call teams as well as Management with the Dynatrace Mobile App.

When Dynatrace DAVIS detects a problem, it first creates a ticket in their Help Desk Platform through the Custom Webhook Integration. Additionally, the On-Call staff as well as Management will receive a notification on their mobile phone as shown on the following screen:

Dynatrace DAVIS can alert On-Call staff and update management through the Dynatrace Mobile App

The management team can immediately get more information about the impact of the current problem without having to reach out to the on-call team. The following screenshot shows a sample:

Details in the Mobile App allow management to understand current problem impact

The On-Call team can immediately start drilling into the problem details. They can also use the comment feature to keep everyone up to date on actions taken and current status:

The comment feature allows the On-Call team to update status, either directly in the mobile app, through the web UI or even through 3rd party tools via the Dynatrace Problem API

Commenting on problem tickets is a great feature. It can be done through the mobile app, through the web interface and also through the Dynatrace Problem API. The API allows you to work on problems in your own tools, e.g.: Help Desk Management Tools, JIRA, ServiceNow, … and push any actions taken as a comment to the Dynatrace Problem. The API also opens the door to pushing information from auto-remediation workflows to Dynatrace.
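To make the API option a bit more concrete, here is a minimal Python sketch of how a help desk tool or remediation script might prepare such a comment. It assumes the Problems API v2 comment endpoint (`/api/v2/problems/{problemId}/comments`) and an `Api-Token` authorization header; please verify the path, payload fields and token scope against the API documentation for your environment. The base URL, token and problem ID are placeholders, and the function only builds the request, so nothing is sent until you decide to send it.

```python
import json
import urllib.request


def build_comment_request(base_url, api_token, problem_id, message, context):
    """Build (but do not send) a POST request that adds a comment to a problem."""
    # Assumed endpoint shape (Problems API v2); check it against your environment.
    url = f"{base_url.rstrip('/')}/api/v2/problems/{problem_id}/comments"
    # Assumed payload fields: the comment text and a free-form context label.
    body = json.dumps({"message": message, "context": context}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Api-Token {api_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


# Placeholder values for illustration only.
request = build_comment_request(
    "https://example.invalid/e/ENVIRONMENT_ID",
    "YOUR_API_TOKEN",
    "PROBLEM_ID",
    "Restarted the application server, watching response time.",
    "Help Desk",
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request)  # uncomment to actually send the comment
```

Keeping the request construction separate from the sending makes it easy to log or review what a workflow is about to post before it touches a real problem.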

As the following screenshot shows, all relevant information is stored on the problem and accessible by everyone. This eliminates any “What’s the status on this problem?” phone calls or Slack messages:

The comment enriched problem ticket eliminates unnecessary status query calls as all status updates are here!

Pro Tip #1: How to Define Custom Naming Rules

Chad was kind enough to share more technical details on the custom naming rules that they are using. All these configuration settings are well documented and can also be configured through the Dynatrace REST API.

Here is where you find the configuration settings in the UI:

Find the configuration sections for custom service and process group names in the Settings menu

The following shows how Chad is configuring the Process Group Naming Rules for IBM WebSphere. The process group name will contain the Application Server Type Name (WebSphere), the cell name and the VMware host name:

Naming Rules should always be specified with conditions. The name format also gives you placeholders and some regex capability!
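If you have not worked with naming rules before, the idea of "a condition plus a name format with placeholders" can be illustrated with a small Python toy model. This is not how Dynatrace implements it, and the rule structure, metadata keys and placeholder names (`cell_name`, `host_name`) are hypothetical ones chosen for this sketch. It only shows the principle: the rule applies when its condition matches, and the placeholders are then filled from the detected metadata.

```python
import re

# Hypothetical rule: NOT Dynatrace's real rule schema or placeholder names.
WEBSPHERE_RULE = {
    "condition": lambda meta: meta.get("technology") == "WebSphere",
    "name_format": "WebSphere {cell_name} {host_name}",
}


def apply_naming_rule(rule, meta, default_name):
    """Return the custom name if the rule's condition matches, else the default."""
    if not rule["condition"](meta):
        return default_name
    # Fill each {placeholder} from the metadata; unknown placeholders stay as-is.
    return re.sub(
        r"\{(\w+)\}",
        lambda match: str(meta.get(match.group(1), match.group(0))),
        rule["name_format"],
    )


# Illustrative metadata for one detected process group.
detected = {"technology": "WebSphere", "cell_name": "Cell01", "host_name": "vmhost-17"}
print(apply_naming_rule(WEBSPHERE_RULE, detected, "GenerateXML"))
```

The condition is what keeps a rule from renaming things it was never meant for: a process group that does not match simply keeps its automatically detected name.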

If you want to dive deeper – check out these resources

Pro Tip #2: Setting up Problem Notifications

Make sure to set up the available problem notifications. The list of integrations is long (Slack, ServiceNow, JIRA, xMatters, PagerDuty, VictorOps, OpsGenie, …) and you can also push notifications to your own endpoints. Here is one of my blogs that explains how to build your own custom Dynatrace problem notification handler.

Also make sure to look into the Dynatrace Mobile App and the Davis Skills for Alexa, Google Assistant, Slack or Chrome.
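For the "push to your own endpoint" option, the receiving side can be quite small. The following Python sketch maps a notification body to a help desk ticket. The JSON keys (`ProblemID`, `ProblemTitle`, `State`, `ImpactedEntity`, `ProblemURL`) are an assumption: the actual keys depend entirely on the custom payload template you configure for the webhook. The ticket fields are hypothetical as well and would need to match your help desk platform.

```python
import json


def ticket_from_notification(raw_body):
    """Map a problem-notification JSON body to a help desk ticket dict."""
    payload = json.loads(raw_body)
    state = payload.get("State", "UNKNOWN")
    return {
        # Keep the problem ID so later updates can find the same ticket.
        "external_id": payload["ProblemID"],
        "summary": f"[{state}] {payload.get('ProblemTitle', 'Untitled problem')}",
        "affected": payload.get("ImpactedEntity", "unknown"),
        "link": payload.get("ProblemURL", ""),
        # Assumed convention: a resolved problem closes the ticket.
        "action": "close" if state == "RESOLVED" else "open_or_update",
    }


# Sample body using the assumed keys; values are made up for illustration.
sample = json.dumps({
    "ProblemID": "P-42",
    "ProblemTitle": "Response time degradation",
    "State": "OPEN",
    "ImpactedEntity": "GenerateXML",
    "ProblemURL": "https://example.invalid/problems/42",
})
print(ticket_from_notification(sample))
```

Because the impacted entity name ends up in the ticket, this is also where the custom naming rules from Pro Tip #1 pay off: a more specific name gives the help desk more context without an extra lookup.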

Pro Tip #3: Emojis are supported

I’ll just leave you with this screenshot for this tip 😊!

Emojis make things easier – hence they are supported 😊

Conclusion: Humans can benefit from AI

Thanks to Chad (a human) for showing us how we can leverage Dynatrace DAVIS to make our lives easier. It is great to see how companies like NYCM are leveraging this new technology, optimizing it to their needs and, with that, providing a better working environment for their employees!
