Amazon Elasticsearch Service (ES)

Dynatrace ingests metrics for multiple preselected namespaces, including Amazon Elasticsearch Service (ES). You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.

Prerequisites

To enable monitoring for this service, you need:

  • An Environment or Cluster ActiveGate version 1.181+
  • Dynatrace version 1.182+
  • An updated AWS monitoring policy to include the additional AWS services.
    To update the AWS IAM policy, use the JSON below, containing the monitoring policy (permissions) for all supporting services.

If you don't want to add permissions for all services and prefer to select permissions for certain services only, consult the table below. The table contains a set of permissions that are required for all services (All monitored Amazon services) and, for each supporting service, a list of optional permissions specific to that service.

Example of a JSON policy for a single service.

In this example, you need to select the following from the complete list of permissions:

  • "apigateway:GET" for Amazon API Gateway
  • "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones" for All monitored Amazon services.
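
The selection above can be sketched as a small script that assembles the policy document. This is a minimal sketch, not the policy shipped by Dynatrace: the function name `build_monitoring_policy` is hypothetical, the permission names are copied from the list above, and the single `Allow` statement with `"Resource": "*"` is an assumption about how the policy is laid out.

```python
import json

# Permissions the text lists as required for all monitored Amazon services.
BASE_ACTIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]


def build_monitoring_policy(service_actions):
    """Return an IAM policy document (as a dict) that allows the
    service-specific actions plus the base actions.

    Assumption: one Allow statement on all resources is sufficient.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(service_actions) + BASE_ACTIONS,
                "Resource": "*",
            }
        ],
    }


# Single-service example from the text: Amazon API Gateway.
policy = build_monitoring_policy(["apigateway:GET"])
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The printed JSON can be pasted into the IAM policy editor. For another service, replace the list passed to `build_monitoring_policy` with the optional permissions listed for that service.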

Add the service to monitoring

To view the service metrics, you must add the service to monitoring in your Dynatrace environment.

Note: Once AWS supporting services are added to monitoring, you might have to wait 15-20 minutes before the metric values are displayed.

Configure service metrics

Once you add a service, Dynatrace automatically starts collecting a suite of metrics for that service. These are the recommended metrics. In addition to the recommended metrics, most services let you enable optional metrics. You can remove or edit any of the existing metrics or, where multiple dimensions are available, any of their dimensions. Metrics with only one dimension can't be edited; they can only be removed or added.

Service-wide metrics are metrics for the whole service across all regions. Typically, these metrics include dimensions containing Region in their name. If selected, these metrics are displayed on a separate chart when viewing your AWS deployment in Dynatrace. Keep in mind that available dimensions differ among services.

To change a metric's statistics, you have to recreate the metric and choose different statistics. You can choose among the following statistics: Sum, Minimum, Maximum, Average, and Sample count. The combined Average + Minimum + Maximum statistic lets you collect all three statistics as one metric instead of as three separate metrics with one statistic each. This can reduce your expenses for retrieving metrics from your AWS deployment.

To be able to save a newly added metric, you need to select at least one statistic and one dimension.
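
The save rule above can be illustrated with a short validation sketch. This is not Dynatrace code: the `MetricSelection` class and its `validate` method are hypothetical names used only to show the rule that a metric needs at least one statistic and at least one dimension. The statistic names are taken from the text.

```python
from dataclasses import dataclass, field

# Statistic choices named in the text, including the combined one.
ALLOWED_STATISTICS = {
    "Sum",
    "Minimum",
    "Maximum",
    "Average",
    "Sample count",
    "Average + Minimum + Maximum",
}


@dataclass
class MetricSelection:
    """Hypothetical model of one metric being added in the configuration."""
    name: str
    statistics: list = field(default_factory=list)
    dimensions: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means the metric can be saved."""
        errors = []
        if not self.statistics:
            errors.append("select at least one statistic")
        for statistic in self.statistics:
            if statistic not in ALLOWED_STATISTICS:
                errors.append(f"unknown statistic: {statistic}")
        if not self.dimensions:
            errors.append("select at least one dimension")
        return errors


ok = MetricSelection("CPUUtilization", ["Maximum"], ["DomainName", "ClientId"])
incomplete = MetricSelection("CPUUtilization")
print(ok.validate())
print(incomplete.validate())
```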

Note: Once AWS supporting services are configured, you might have to wait 15-20 minutes before the metric values are displayed.

View service metrics

Once you add the service to monitoring, you can view the service metrics in your Dynatrace environment either on your dashboard page or on the custom device overview page.

Available metrics

| Name | Description | Unit | Statistics | Dimensions | Recommended |
|---|---|---|---|---|---|
| AutomatedSnapshotFailure | The number of failed automated snapshots for the cluster. A value of 1 indicates that no automated snapshot of the domain has been taken for the last 36 hours. | Count | Minimum | DomainName, ClientId | |
| AutomatedSnapshotFailure | | Count | Maximum | DomainName, ClientId | |
| CPUCreditBalance | The remaining CPU credits available for data nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. This metric is available only for the T2 instance types. | Count | Minimum | DomainName, ClientId | |
| CPUUtilization | The percentage of CPU usage for data nodes in the cluster. Maximum shows the node with the highest CPU usage. Average represents all nodes in the cluster. This metric is also available for individual nodes. | Percent | Maximum | DomainName, ClientId | ✔️ |
| CPUUtilization | | Percent | Average | DomainName, ClientId | ✔️ |
| ClusterIndexWritesBlocked | Indicates whether your cluster is accepting or blocking incoming write requests. A value of 0 means that the cluster is accepting requests. A value of 1 means that the cluster is blocking requests. | Count | Maximum | DomainName, ClientId | |
| ClusterStatus.green | A value of 1 indicates that all index shards are allocated to nodes in the cluster | Count | Minimum | DomainName, ClientId | ✔️ |
| ClusterStatus.green | | Count | Maximum | DomainName, ClientId | |
| ClusterStatus.red | A value of 1 indicates that the primary and replica shards for at least one index aren't allocated to nodes in the cluster | Count | Minimum | DomainName, ClientId | |
| ClusterStatus.red | | Count | Maximum | DomainName, ClientId | ✔️ |
| ClusterStatus.yellow | A value of 1 indicates that the primary shards for all indices are allocated to nodes in the cluster, but replica shards for at least one index aren't allocated to nodes in the cluster | Count | Minimum | DomainName, ClientId | |
| ClusterStatus.yellow | | Count | Maximum | DomainName, ClientId | ✔️ |
| ClusterUsedSpace | The total used space for the cluster | Megabytes | Minimum | DomainName, ClientId | |
| ClusterUsedSpace | | Megabytes | Maximum | DomainName, ClientId | |
| DeletedDocuments | The total number of documents marked for deletion across all data nodes in the cluster. These documents no longer appear in search results, but Elasticsearch only removes deleted documents from disk during segment merges. This metric increases after delete requests and decreases after segment merges. | Count | Multi | DomainName, ClientId | |
| DiskQueueDepth | The number of pending input and output (I/O) requests for an EBS volume | Count | Multi | DomainName, ClientId | |
| ElasticsearchRequests | The number of requests made to the Elasticsearch cluster | Count | Sum | DomainName, ClientId | ✔️ |
| FreeStorageSpace | The free space for data nodes in the cluster | Megabytes | Multi | DomainName, ClientId | ✔️ |
| FreeStorageSpace | | Megabytes | Sum | DomainName, ClientId | |
| InvalidHostHeaderRequests | The number of HTTP requests made to the Elasticsearch cluster that included an invalid (or missing) host header | Count | Sum | DomainName, ClientId | |
| JVMMemoryPressure | The maximum percentage of the Java heap used for all data nodes in the cluster | Percent | Maximum | DomainName, ClientId | |
| KMSKeyError | A value of 1 indicates that the KMS customer master key used to encrypt data at rest has been disabled | Count | Minimum | DomainName, ClientId | |
| KMSKeyError | | Count | Maximum | DomainName, ClientId | |
| KMSKeyInaccessible | A value of 1 indicates that the KMS customer master key used to encrypt data at rest has been deleted or revoked its grants to Amazon ES | Count | Minimum | DomainName, ClientId | |
| KMSKeyInaccessible | | Count | Maximum | DomainName, ClientId | |
| KibanaHealthyNodes | A health check for Kibana. A value of 1 indicates normal behavior. A value of 0 indicates that Kibana is inaccessible. In most cases, the health of Kibana mirrors the health of the cluster. | Count | Minimum | DomainName, ClientId | |
| MasterCPUCreditBalance | The remaining CPU credits available for dedicated master nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. | Count | Minimum | DomainName, ClientId | |
| MasterCPUUtilization | The maximum percentage of CPU resources used by the dedicated master nodes | Percent | Average | DomainName, ClientId | ✔️ |
| MasterJVMMemoryPressure | The maximum percentage of the Java heap used for all dedicated master nodes in the cluster | Percent | Maximum | DomainName, ClientId | |
| MasterReachableFromNode | A health check for MasterNotDiscovered exceptions. A value of 1 indicates normal behavior. A value of 0 indicates that `/_cluster/health/` is failing. | Count | Minimum | DomainName, ClientId | |
| Nodes | The number of nodes in the Amazon ES cluster, including dedicated master nodes and UltraWarm nodes | Count | Multi | DomainName, ClientId | ✔️ |
| ReadIOPS | The number of input and output (I/O) operations per second for read operations on EBS volumes | Count/Second | Multi | DomainName, ClientId | |
| ReadLatency | The latency for read operations on EBS volumes | Seconds | Multi | DomainName, ClientId | |
| ReadThroughput | The throughput for read operations on EBS volumes | Bytes/Second | Multi | DomainName, ClientId | |
| RequestCount | The number of requests made to the Elasticsearch cluster | Count | Sum | DomainName, ClientId | |
| SearchableDocuments | The total number of searchable documents across all data nodes in the cluster | Count | Multi | DomainName, ClientId | |
| WriteIOPS | The number of input and output (I/O) operations per second for write operations on EBS volumes | Count/Second | Multi | DomainName, ClientId | |
| WriteLatency | The latency for write operations on EBS volumes | Seconds | Multi | DomainName, ClientId | |
| WriteThroughput | The throughput for write operations on EBS volumes | Bytes/Second | Multi | DomainName, ClientId | |
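
To show how a metric name, statistic, and the two dimensions from the table fit together, the following sketch builds the parameters for a CloudWatch `GetMetricStatistics` request, which is one of the permissions listed in the prerequisites. It is an illustration only and does not describe how Dynatrace queries AWS. The helper name `es_metric_request`, the domain name `my-domain`, and the account ID `123456789012` are placeholders, and the sketch assumes that the metrics are published under the `AWS/ES` CloudWatch namespace with `ClientId` set to the AWS account ID.

```python
from datetime import datetime, timedelta, timezone


def es_metric_request(metric_name, statistic, domain_name, client_id,
                      minutes=60, period=300):
    """Build keyword arguments for a CloudWatch GetMetricStatistics call.

    Hypothetical helper: it only assembles the request and does not call AWS.
    """
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ES",  # assumed namespace for Amazon ES metrics
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": client_id},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,  # granularity of the datapoints, in seconds
        "Statistics": [statistic],
    }


# Placeholder domain name and account ID.
params = es_metric_request("CPUUtilization", "Maximum", "my-domain", "123456789012")
print(params["MetricName"], params["Statistics"], params["Dimensions"])

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("cloudwatch").get_metric_statistics(**params)
```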