Accelerate observability with Dynatrace Managed performance improvements

“Dynatrace just makes observability easy: it works out-of-the-box, no silos of data, no DIY stitching together tools, no wasted time, and no wasted resources.”

Bernd Greifeneder, Dynatrace CTO

Dynatrace Managed provides all the power of a Dynatrace SaaS environment in an on-premises solution that delivers SaaS conveniences such as proactive support for maintenance and licensing, automatic updates, and easy troubleshooting by our support teams. With Dynatrace Managed Mission Control Support Services watching your deployment, our solution provides you with the highest value, resilience, and proactive support, which translates into the lowest cost of ownership. No competing software provides this level of value with so little effort, or hardware utilization this efficient all the way up to web-scale!

We’re continuously investing in performance optimizations, high availability, and resilience for Dynatrace Managed deployments. In this blog post, I’d like to share with you three major improvements that we’ve implemented recently that enable you to achieve more at no additional cost, helping you to gain new insights with higher fidelity PurePaths, monitor more entities, and automate more.

Support for high memory instances

I’d like to stress the lean approach to hardware that our customers require for running Dynatrace Managed. Wasted resources simply aren’t an option. Say, for example, that you have some unused high memory instances (1 TB RAM / 128 vCPU cores or higher). Before Dynatrace Managed 1.198, we had no option other than to set a threshold of 512 GB RAM to ensure optimum memory utilization. Now, Dynatrace Managed better utilizes high memory instances so you can leverage the hardware you already have!

Optimal metric storage management strategy

Dynatrace Managed metric storage management is reliable and delivers high performance. Unlike our competition, Dynatrace Managed processes all the data sent by OneAgent without sampling. However, the previous storage strategy required a high disk space reservation of 50%, and large chunks of data (called SSTables) could remain stuck on disk well beyond their retention time (for example, the retention time for time series data at 1-minute resolution is 14 days), which wasted disk space. Additionally, reclamation of that wasted space often didn’t occur for weeks, if at all, which risked depleting the available disk space.

Our new solution for managing metric storage doesn’t reclaim disk space by data compaction. Rather, it removes “fully expired” SSTables entirely from disk via a simple file system deletion, which is comparable to dropping partitions in a relational database management system (RDBMS). The solution leverages Cassandra’s Time Window Compaction Strategy (TWCS), which results in predictable, constant, and efficient disk utilization. What’s more, it improves data-read performance and lowers CPU utilization.
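To make the idea of dropping “fully expired” SSTables more concrete, here is a minimal sketch in Python. It is a simplified illustration only, not Cassandra’s or Dynatrace’s actual implementation: the `SSTable` class, its fields, and the function names are all hypothetical, and the 1-day window size is an assumption chosen for the example.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds in one day


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for one on-disk data file."""
    name: str
    newest_timestamp: int  # newest data point in the file, epoch seconds
    size_bytes: int


def window_start(timestamp, window_seconds=DAY):
    """Map a timestamp to the start of the time window it falls into."""
    return timestamp - (timestamp % window_seconds)


def drop_fully_expired(tables, now, retention_seconds):
    """Split tables into (kept, dropped).

    A table is dropped only when even its newest data point is past
    retention, so the whole file can be deleted without rewriting it.
    """
    kept, dropped = [], []
    for table in tables:
        if table.newest_timestamp + retention_seconds <= now:
            dropped.append(table)
        else:
            kept.append(table)
    return kept, dropped


# Example: three daily files, checked on day 30 with 14-day retention.
now = 30 * DAY
tables = [
    SSTable("day-10", newest_timestamp=11 * DAY - 1, size_bytes=500),
    SSTable("day-15", newest_timestamp=16 * DAY - 1, size_bytes=500),
    SSTable("day-20", newest_timestamp=21 * DAY - 1, size_bytes=500),
]
kept, dropped = drop_fully_expired(tables, now, retention_seconds=14 * DAY)
print([t.name for t in dropped])           # ['day-10', 'day-15']
print(sum(t.size_bytes for t in dropped))  # 1000
```

Because each file only holds data from a single time window, every data point in it expires at roughly the same moment. That is what makes whole-file deletion possible and keeps disk usage predictable.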

Impact on disk space

Starting with Dynatrace Managed version 1.200, cluster nodes automatically use the new disk space management strategy. Before a cluster can fully benefit from this strategy, however, new data must replace the existing data. During the transition period, higher disk space usage is expected (until 1-minute interval time series data has been captured and stored for 14 days). We’ve already reached out proactively to all our customers who run with less than 50% free disk space, so that they can adjust to this change and continue running successfully.
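If you’d like to check a node against the 50% free disk space mark yourself, a quick sketch like the following may help. It uses Python’s standard `shutil.disk_usage`; the function names and the `"."` path are placeholders of my own for illustration, so point the check at whichever volume holds your metric storage.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Return the share of a volume that is still free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def has_transition_headroom(path, min_free=0.5):
    """True if the volume containing `path` has at least `min_free` free.

    The 0.5 default mirrors the 50% figure mentioned in this post; it is
    not an official sizing rule.
    """
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= min_free


# "." is a placeholder; replace it with your metric storage location.
print(has_transition_headroom("."))
```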

Take a look below to see the positive impact on disk space usage in one of our test environments (transition period set to 7 days). This is a 50% reduction in required disk space! Typically, you can expect disk space savings of 30-40%.

Increased processing power with the update to JRE 11

With Dynatrace Managed version 1.202 (planned for October 2020), we will release additional performance improvements for some components based on the update to Java Runtime Environment (JRE) 11. JRE 11 brings performance boosts, the latest security fixes, and some bug fixes. All affected services will be automatically restarted during the upgrade. No downtime is expected during the upgrade for clusters that have 3 or more nodes.

Since Dynatrace Managed version 1.192, the major trigger for Adaptive Load Reduction (ALR) is cluster node health, and the key performance metric is garbage collection suspension time. I wrote about this in the blog post “Process more with less using smarter cluster overload prevention for Dynatrace Managed.” One of the biggest improvements in the most recent Java release is more effective garbage collection management that leads to shorter suspension times. Our new ALR algorithm relies on how effectively Java deals with memory. That’s why the update to JRE 11 significantly improves the processing power of Dynatrace Managed clusters and allows you to monitor more service entities and process even more transactions (PurePaths) using the same hardware.
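To illustrate why shorter garbage collection suspension times translate into more processed PurePaths, here is a rough sketch of a health-driven load reduction rule. This is not the actual ALR algorithm: the function names, the linear ramp, and both thresholds are assumptions I made up purely for illustration.

```python
def gc_suspension_ratio(pause_ms, window_ms):
    """Fraction of an observation window spent in GC pauses."""
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return min(sum(pause_ms) / window_ms, 1.0)


def acceptance_rate(ratio, healthy=0.05, overloaded=0.25):
    """Fraction of incoming PurePaths to accept.

    `healthy` and `overloaded` are made-up thresholds: at or below
    `healthy` everything is accepted, at or above `overloaded` nothing
    is, with a linear ramp in between.
    """
    if ratio <= healthy:
        return 1.0
    if ratio >= overloaded:
        return 0.0
    return (overloaded - ratio) / (overloaded - healthy)


# Hypothetical pause times (ms) observed in a 10-second window.
long_pauses = gc_suspension_ratio([400, 350, 250], window_ms=10_000)
short_pauses = gc_suspension_ratio([100, 80], window_ms=10_000)

print(acceptance_rate(long_pauses))   # some PurePaths are rejected
print(acceptance_rate(short_pauses))  # everything is accepted
```

With the same incoming load, the node with shorter pauses stays under the “healthy” threshold and keeps all its data, which is the effect you would hope for from a more efficient garbage collector.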

If you experience ALR kicking in from time to time, it means that your cluster is running at capacity. The upgrade to Dynatrace Managed version 1.202 will provide you with more headroom for operations. Take a look at the chart below, where you can see one of our customers’ cluster nodes. Before the upgrade, the cluster node had to reject about 30% of PurePaths (shown in turquoise). Following the upgrade, the amount of rejected data was near zero!

Have questions about these features?

Your input matters. Please share your feedback with us by posting your questions and clarifications in the Dynatrace Open Q&A forum.
