Managed hardware requirements

This topic explains the hardware requirements for installing Dynatrace Managed. For other Dynatrace Managed requirements, see Managed system requirements and Managed hardware recommendations for cloud deployments.

Sizing considerations

Sizing generally consists of these elements:

Be sure to consider each element before proceeding.

General planning

It's not always possible to provision nodes that are sized exactly right, particularly if your environment is subject to ever-increasing traffic levels. While it's useful to do upfront analysis of required size, it's more important to have the ability to add more capacity to your Dynatrace Managed cluster should your monitoring needs increase in the future. To leverage the full benefits of the Dynatrace Managed architecture, be prepared to scale along the following dimensions:

  • Horizontally by adding more nodes. We support installations of up to 30 cluster nodes.
  • Vertically by provisioning more RAM/CPU per node.
  • In terms of data storage, by being able to resize the disk volumes as required (for guidelines regarding recommended disk setup see below).

For cloud deployments, use the recommended virtual machine equivalents listed in Managed hardware recommendations for cloud deployments.

Hardware requirements

The hardware requirements included in the following table are estimates based on typical environments and load patterns. Requirements for individual environments may vary. Estimates for specific columns take into account the following:

  • Minimum node specifications
    CPU and RAM must be exclusively available for Dynatrace. Power saving mode for CPUs must be disabled. CPUs must run with a clock speed of at least 2 GHz, and the host should have at least 32 GB of RAM assigned to it.

  • Transaction Storage
    Transaction data is distributed across all nodes and isn't stored redundantly. In multi-node clusters, transaction data storage is divided by the number of nodes.

  • Long-term Metrics Store
    For multi-node installations, three copies of the metrics store are saved. For four or more nodes, the storage requirement per node is reduced.

    You should treat the 4 TB requirement for the XLarge node as the maximum acceptable size. If you need more capacity, consider adding another node. Plan for long-term metrics store data to occupy at most 50% of the available disk space. In these terms, 4 TB of disk space accommodates 2 TB of long-term metrics store data. While stores larger than 4 TB are possible, they can make database maintenance problematic.
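The 50% guideline above can be turned into a quick check. The following is a minimal sketch; the function names and the `max_fill` parameter are hypothetical, and the 4 TB ceiling is taken from the text as a recommendation, not a hard limit.

```python
RECOMMENDED_MAX_VOLUME_TB = 4.0  # "maximum acceptable size" from the text above


def max_metrics_data_tb(disk_tb: float, max_fill: float = 0.5) -> float:
    """Largest metrics data size that keeps the volume at or below max_fill."""
    return disk_tb * max_fill


def required_disk_tb(metrics_data_tb: float, max_fill: float = 0.5) -> float:
    """Disk size needed so that the data occupies at most max_fill of it."""
    return metrics_data_tb / max_fill


def exceeds_recommended_volume(disk_tb: float) -> bool:
    """True when the volume is larger than the recommended 4 TB ceiling."""
    return disk_tb > RECOMMENDED_MAX_VOLUME_TB


print(max_metrics_data_tb(4.0))   # data that fits on a 4 TB volume
print(required_disk_tb(3.0))      # disk needed for 3 TB of data
print(exceeds_recommended_volume(required_disk_tb(3.0)))
```

When `exceeds_recommended_volume` returns `True`, the guidance above suggests adding another node rather than growing the volume further.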

Dynatrace Managed

| Node Type | Max hosts monitored (per node) | Peak user actions/min (per node) | Min node specifications | Disk IOPS (per node) | Transaction Storage (10 days code visibility) | Long-term Metrics Store (per node) | Elasticsearch (per node, 35 days retention) |
|---|---|---|---|---|---|---|---|
| Micro | 50 | 1000 | 4 vCPUs, 32 GB RAM | 30 | 50 GB | 100 GB | 50 GB |
| Small | 300 | 10000 | 8 vCPUs, 64 GB RAM | 100 | 300 GB | 500 GB | 500 GB |
| Medium | 600 | 25000 | 16 vCPUs, 128 GB RAM | 300 | 600 GB | 1 TB | 1.5 TB |
| Large | 1250 | 50000 | 32 vCPUs, 256 GB RAM | 750 | 1 TB | 2 TB | 1.5 TB |
| XLarge (1) | 2500 | 100000 | 64 vCPUs, 512 GB RAM | 1500 | 2 TB | 4 TB | 3 TB |

(1) While Dynatrace Managed runs resiliently on instances with 1 TB+ RAM/128 cores (2XLarge) and allows you to monitor more entities, it's not the optimal way of utilizing the hardware. Instead, we recommend that you use smaller instances (Large or XLarge).

Examples

  • To monitor up to 7,500 hosts with a peak load of 300,000 user actions per minute, you need 3 extra large (XLarge) nodes, each with 9 TB of storage divided among the storage types.

  • To monitor 500 hosts with a peak load of 30,000 user actions per minute, you need 3 small nodes, each with 1.3 TB of storage divided among the storage types. Alternatively, you can use 1 medium node with 2.1 TB of storage.
    We recommend a failover setup of at least 3 nodes rather than a single node, which is less resilient.
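The arithmetic behind the three-node examples can be sketched as follows. This is an illustrative estimate only: the class, dictionary, and function names are hypothetical, the figures are copied from the Dynatrace Managed table, 1 TB is assumed to equal 1000 GB, and per-node storage is taken as the simple sum of the three storage columns, as the examples do.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    max_hosts: int
    peak_user_actions_per_min: int
    transaction_gb: int
    metrics_gb: int
    elasticsearch_gb: int

    @property
    def storage_gb(self) -> int:
        # Per-node storage as the sum of the three storage columns.
        return self.transaction_gb + self.metrics_gb + self.elasticsearch_gb


# Values copied from the table; 1 TB is treated as 1000 GB (assumption).
NODE_TYPES = {
    "Micro": NodeType(50, 1_000, 50, 100, 50),
    "Small": NodeType(300, 10_000, 300, 500, 500),
    "Medium": NodeType(600, 25_000, 600, 1_000, 1_500),
    "Large": NodeType(1_250, 50_000, 1_000, 2_000, 1_500),
    "XLarge": NodeType(2_500, 100_000, 2_000, 4_000, 3_000),
}


def nodes_needed(node_type: str, hosts: int, peak_actions: int,
                 min_nodes: int = 3) -> int:
    """Nodes required to cover both limits, never fewer than min_nodes."""
    node = NODE_TYPES[node_type]
    by_hosts = math.ceil(hosts / node.max_hosts)
    by_actions = math.ceil(peak_actions / node.peak_user_actions_per_min)
    return max(by_hosts, by_actions, min_nodes)


for name, hosts, actions in [("XLarge", 7_500, 300_000), ("Small", 500, 30_000)]:
    count = nodes_needed(name, hosts, actions)
    print(name, count, "nodes,", NODE_TYPES[name].storage_gb, "GB each")
```

The `min_nodes` default of 3 reflects the failover recommendation above. Real environments may need more capacity than this estimate, because the table values are themselves estimates for typical load patterns.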

Dynatrace Managed Premium High Availability

| Node Type | Max hosts monitored (per node) | Peak user actions/min (per node) | Min node specifications | Disk IOPS (per node) | Transaction Storage (10 days code visibility) | Long-term Metrics Store (per node) | Elasticsearch (per node, 35 days retention) |
|---|---|---|---|---|---|---|---|
| Large | 600 | 25000 | 32 vCPUs, 256 GB RAM | 750 | 1 TB | 2 TB | 1.5 TB |
| XLarge (1) | 1250 | 50000 | 64 vCPUs, 512 GB RAM | 1500 | 2 TB | 4 TB | 3 TB |

(1) While Dynatrace Managed runs resiliently on instances with 1 TB+ RAM/128 cores (2XLarge) and allows you to monitor more entities, it's not the optimal way of utilizing the hardware. Instead, we recommend that you use smaller instances (Large or XLarge).

Example

To monitor 7,500 hosts with a peak load of 300,000 user actions per minute in the Premium High Availability deployment, you need 6 extra large (XLarge) nodes: 3 nodes in one data center and 3 nodes in a second data center. Each node needs 9 TB of storage divided among the storage types.
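The node count in this example follows from the per-node limits in the Premium High Availability table. The sketch below is a rough estimate under stated assumptions: the names are hypothetical, and an equal split across two data centers is assumed because that is what the example describes.

```python
import math

# (max hosts per node, peak user actions/min per node) from the table above.
PREMIUM_HA_LIMITS = {
    "Large": (600, 25_000),
    "XLarge": (1_250, 50_000),
}


def premium_ha_layout(node_type: str, hosts: int, peak_actions: int,
                      data_centers: int = 2) -> tuple[int, int]:
    """Return (nodes per data center, total nodes) for an even split."""
    max_hosts, max_actions = PREMIUM_HA_LIMITS[node_type]
    total = max(math.ceil(hosts / max_hosts),
                math.ceil(peak_actions / max_actions))
    # Round up so that every data center gets the same number of nodes.
    per_dc = math.ceil(total / data_centers)
    return per_dc, per_dc * data_centers


per_dc, total = premium_ha_layout("XLarge", 7_500, 300_000)
print(per_dc, "nodes per data center,", total, "nodes in total")
```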

Storage recommendations

Dynatrace Managed stores multiple types of monitoring data, depending on the use case.

We recommend:

  • Storing Dynatrace binaries and the data store on separate mount points to allow the data store to be resized independently.
  • Not keeping Dynatrace data storage on the root volume to avoid additional complexity when resizing the disk later, if required.
  • Mounting different types of data storage on separate disk volumes for maximum flexibility and performance.
  • Creating resizable disk partitions (for example, by leveraging Logical Volume Manager [LVM]).

OneAgent opt-out

OneAgent self-monitoring is enabled by default. An opt-out installation parameter is available:

--install-agent <on|off>

Supported file systems

Dynatrace Managed operates on all common file systems. We recommend that you select fast local storage appropriate for database workloads. High-latency remote volumes such as NFS or CIFS aren't recommended. While NFS file systems are sufficient for backup purposes, we don't recommend them for primary storage.

Amazon Elastic File System

We don't support or recommend Amazon Elastic File System (EFS) as the main storage for Elasticsearch. Such file systems don't offer the behavior that Elasticsearch requires, and this may lead to index corruption.

Log Monitoring v2 recommendations

For Log Monitoring v2, we recommend the following:

  • For a more robust configuration, it's better to add more cluster nodes than to increase hardware on each node.
  • Distribute additional Elasticsearch storage equally across cluster nodes.
  • Add CPUs and RAM to existing cluster nodes such that nodes remain equally sized.
  • For each one hundred million (100,000,000) log events per day (70,000 events per minute on average and 140,000 events per minute during peaks) on your cluster, add additional resources distributed across all cluster nodes:
    • 5 CPU cores
    • 3.3 TB disk (holding 2 replicas of data for high availability)
    • 6.5 GB RAM

For example, to handle three hundred million (300,000,000) log events per day you will need an additional 15 CPUs, 9.9 TB disk, and 19.5 GB RAM. On an existing cluster of three medium-size nodes, you could do one of the following:

  • (Recommended) Add two additional medium-size nodes to form a five-node cluster. Additionally, extend the Elasticsearch storage on each node by 2 TB (in which case, each node stores 40% of the doubly replicated data).
  • Add 3.3 TB disk storage (each node stores 67% of doubly replicated data), 8 CPUs, and 16 GB RAM per node.
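The per-100-million rule above scales linearly, which a short calculation makes explicit. This is a sketch only: the function names are hypothetical, resources are split evenly across nodes, and the replica share assumes the two replicas mentioned above are spread evenly over all nodes.

```python
# Additional resources per 100,000,000 log events per day, from the list above.
EVENTS_PER_UNIT = 100_000_000
PER_UNIT = {"cpus": 5, "disk_tb": 3.3, "ram_gb": 6.5}


def additional_log_resources(events_per_day: int, nodes: int):
    """Return (cluster-wide totals, even per-node split), rounded to 2 places."""
    units = events_per_day / EVENTS_PER_UNIT
    total = {key: round(value * units, 2) for key, value in PER_UNIT.items()}
    per_node = {key: round(value * units / nodes, 2)
                for key, value in PER_UNIT.items()}
    return total, per_node


def replica_share(nodes: int, replicas: int = 2) -> float:
    """Fraction of the replicated data that each node stores."""
    return replicas / nodes


total, per_node = additional_log_resources(300_000_000, nodes=5)
print(total)      # cluster-wide additions for 300 million events per day
print(per_node)   # even split across a five-node cluster
print(f"{replica_share(5):.0%} of the replicated data per node")
```

The even split is a lower bound. The second option in the text lists larger per-node CPU and RAM figures (8 CPUs and 16 GB RAM) than dividing the totals by three yields, so treat the output as a starting point rather than a final specification.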

The following recommendations are based on approximate log events per day:

| Log events per day | Additional CPUs | Additional disk space | Additional RAM |
|---|---|---|---|
| One hundred million (100,000,000) | 5 | 3.3 TB | 6.5 GB |
| Three hundred million (300,000,000) | 15 | 9.9 TB | 19.5 GB |
| Five hundred million (500,000,000) | 25 | 16.5 TB | 32 GB |

In the case of larger or richer events, or if you exceed one billion (1,000,000,000) log events per day, please contact support for advice on sizing.

Note:

  • These recommendations are in addition to any requirements from other traffic sources.
  • Log events are stored in the Elasticsearch storage.
  • Log events are stored with a replication factor of 2.
  • Keep in mind that the retention time for log events is 35 days.

Multi-node installations

We recommend multi-node setups for failover and data redundancy. A sufficiently sized 3-node cluster is the recommended setup. For Dynatrace Managed installations with more than one node, all nodes must:

  • Have the same hardware configuration
  • Be synchronized with NTP
  • Be in the same time zone
  • Be able to communicate over a private network on multiple ports
  • Have a latency between nodes of around 10 ms or less
  • Use the same UID:GID identifiers for the system users created for Dynatrace Managed

Avoid split-brain sync problems

While two-node clusters are technically possible, we don't recommend them. Our storage systems are consensus-based and require a majority for data consistency. That's why a two-node cluster is vulnerable to the split-brain problem and should be treated as a temporary state when migrating to 3 or more nodes. Running two nodes may create availability problems or data inconsistencies when the nodes act as two separate data sets (single-node clusters) that overlap but don't communicate or synchronize their data with each other.
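The majority requirement explains why a second node adds no fault tolerance. The following sketch assumes a simple majority quorum (more than half of the nodes), which is a generic model and not a description of how each Dynatrace Managed storage system is implemented; the function names are hypothetical.

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can be lost while a majority can still be formed."""
    return nodes - majority(nodes)


for n in (1, 2, 3, 5):
    print(f"{n} node(s): majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this model, a two-node cluster needs both nodes to form a majority, so it tolerates no more failures than a single node, while a three-node cluster can lose one node and still reach agreement.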