Amazon ElastiCache monitoring
Dynatrace ingests metrics for multiple preselected namespaces, including Amazon ElastiCache. You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.
Prerequisites
To enable monitoring for this service, you need:
- ActiveGate version 1.181+, as follows:
  - For Dynatrace SaaS deployments, you need an Environment ActiveGate or a Multi-environment ActiveGate.
  - For Dynatrace Managed deployments, you can use any kind of ActiveGate.
  For role-based access (whether in a SaaS or Managed deployment), you need an Environment ActiveGate installed on an Amazon EC2 host.
- Dynatrace version 1.182+
- An updated AWS monitoring policy to include the additional AWS services.
To update the AWS IAM policy, use the JSON below, containing the monitoring policy (permissions) for all supporting services.
If you want to grant permissions only for certain services rather than for all of them, consult the table below. The table contains a set of permissions that are required for all services (All monitored Amazon services) and, for each supporting service, a list of optional permissions specific to that service.
Example of JSON policy for one single service.
In this example, from the complete list of permissions you need to select "apigateway:GET" for Amazon API Gateway, and "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones" for All monitored Amazon services.
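
As an illustration, the permissions named in this example can be assembled into a policy document. The sketch below uses Python only to build and print the JSON. The `Version` value and the statement layout follow the standard AWS IAM policy grammar; the `Sid` value, the `build_policy` helper, and the choice of `"Resource": "*"` are assumptions made for this example and are not taken from this page.

```python
import json

# Permissions the text lists as required for all monitored Amazon services.
BASE_ACTIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]


def build_policy(service_actions):
    """Combine the base permissions with service-specific ones (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DynatraceMonitoringExample",  # assumed label
                "Effect": "Allow",
                "Action": BASE_ACTIONS + list(service_actions),
                "Resource": "*",  # assumption: not scoped to specific resources
            }
        ],
    }


# Single-service example from the text: Amazon API Gateway.
policy = build_policy(["apigateway:GET"])
print(json.dumps(policy, indent=2))
```

To cover a different service, pass that service's optional permissions instead of `"apigateway:GET"`.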
Enable monitoring
To enable monitoring for this service, you first need to integrate Dynatrace with Amazon Web Services.
Add the service to monitoring
In order to view the service metrics, you must add the service to monitoring in your Dynatrace environment.
Once AWS cloud services are added to monitoring, you might have to wait 15-20 minutes before the metric values are displayed.
All cloud services consume Davis data units (DDUs). The amount of DDU consumption per service instance depends on the number of monitored metrics and their dimensions (each metric dimension results in the ingestion of 1 data point; 1 data point consumes 0.001 DDUs).
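
The DDU arithmetic above can be sketched as a small calculation. The rate of 0.001 DDUs per data point comes from the text; the `collection_cycles` parameter is an assumption, because this page does not state how often data points are ingested, and the sample numbers are hypothetical.

```python
# Rate stated in the text: one data point consumes 0.001 DDUs.
DDU_PER_DATA_POINT = 0.001


def estimate_ddus(metric_dimensions, collection_cycles):
    """Estimate DDU consumption for one service instance.

    metric_dimensions: number of monitored metric dimensions; per the text,
        each one results in the ingestion of one data point.
    collection_cycles: ASSUMPTION, how many times those data points are
        ingested in the period of interest (not specified in the text).
    """
    if metric_dimensions < 0 or collection_cycles < 0:
        raise ValueError("counts must be non-negative")
    data_points = metric_dimensions * collection_cycles
    return data_points * DDU_PER_DATA_POINT


# Hypothetical example: 20 metric dimensions ingested 1,440 times.
print(round(estimate_ddus(20, 1440), 3))
```

Enabling fewer optional metrics or dimensions lowers `metric_dimensions` and therefore the estimate.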
Monitor resources based on tags
You can choose to monitor resources based on existing AWS tags, as Dynatrace automatically imports them from service instances. Nevertheless, the transition from AWS to Dynatrace tagging isn't supported for all AWS services. Expand the table below to see which cloud services are filtered by tagging.
To monitor resources based on tags
- In the Dynatrace menu, go to Settings > Cloud and virtualization > AWS and select Edit for the desired AWS instance.
- For Resources to be monitored, select Monitor resources selected by tags.
- Enter the Key and Value.
- Select Save.
Configure service metrics
Once you add a service, Dynatrace automatically starts collecting a suite of metrics for that service. These are the recommended metrics. In addition to the recommended metrics, most services let you enable optional metrics. You can remove or edit any of the existing metrics or, where multiple dimensions are available, any of their dimensions. Metrics consisting of only one dimension can't be edited; they can only be removed or added.
Service-wide metrics are metrics for the whole service across all regions. Typically, these metrics include dimensions containing Region in their name. If selected, these metrics are displayed on a separate chart when viewing your AWS deployment in Dynatrace. Keep in mind that available dimensions differ among services.
To change a metric's statistics, you have to recreate that metric and choose different statistics. You can choose among the following statistics: Sum, Minimum, Maximum, Average, and Sample count. The Average + Minimum + Maximum option collects all three statistics as one metric instead of as three separate metrics with one statistic each. This can reduce your expenses for retrieving metrics from your AWS deployment.
To be able to save a newly added metric, you need to select at least one statistic and one dimension.
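
The save rule above can be expressed as a simple check. This is only an illustrative sketch: the `can_save_metric` function is hypothetical and does not represent how Dynatrace validates the form; the statistic names are the ones listed in this section.

```python
# Statistics named in the text.
ALLOWED_STATISTICS = {"Sum", "Minimum", "Maximum", "Average", "Sample count"}


def can_save_metric(statistics, dimensions):
    """Return True if at least one known statistic and one dimension are selected."""
    chosen = set(statistics)
    has_statistic = bool(chosen) and chosen <= ALLOWED_STATISTICS
    has_dimension = len(list(dimensions)) > 0
    return has_statistic and has_dimension


print(can_save_metric(["Sum"], ["CacheClusterId"]))  # statistic and dimension selected
print(can_save_metric([], ["CacheClusterId"]))       # no statistic selected
print(can_save_metric(["Average"], []))              # no dimension selected
```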
Once AWS cloud services are configured, you might have to wait 15-20 minutes before the metric values are displayed.
View service metrics
You can view the service metrics in your Dynatrace environment either on the custom device overview page or on your Dashboards page.
View metrics on the custom device overview page
To access the custom device overview page
- In the Dynatrace menu, go to Technologies and processes.
- Filter by service name and select the relevant custom device group.
- Once you select the custom device group, you're on the custom device group overview page.
- The custom device group overview page lists all instances (custom devices) belonging to the group. Select an instance to view the custom device overview page.
View metrics on your dashboard
You can also view metrics in the Dynatrace web UI on dashboards. There is no preset dashboard available for this service, but you can create your own dashboard.
To check the availability of preset dashboards for each AWS service, see the list below.
Available metrics
Name | Description | Unit | Statistics | Dimensions | Recommended |
---|---|---|---|---|---|
ActiveDefragHits | The number of value reallocations per minute performed by the active defragmentation process. This is derived from active_defrag_hits statistic. | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
BytesReadIntoMemcached | The number of bytes that have been read from the network by the cache node | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
BytesUsedForCache | The total number of bytes allocated by Redis for all purposes, including the dataset, buffers, and so on. This is derived from used_memory statistic. | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
BytesUsedForCacheItems | The number of bytes used to store cache items | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
BytesUsedForHash | The number of bytes currently used by hash tables | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
BytesWrittenOutFromMemcached | The number of bytes that have been written to the network by the cache node | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CPUUtilization | The percentage of CPU utilization for the entire host | Percent | Multi | CacheClusterId | |
CPUUtilization | The percentage of CPU utilization for the entire host | Percent | Multi | CacheClusterId, CacheNodeId | |
CacheHits | The number of successful read-only key lookups in the main dictionary. This is derived from keyspace_hits statistic. | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CacheMisses | The number of unsuccessful read-only key lookups in the main dictionary. This is derived from keyspace_misses statistic. | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CasBadval | The number of CAS (check and set) requests the cache has received where the CAS value did not match the CAS value stored | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CasHits | The number of CAS requests the cache has received where the requested key was found and the CAS value matched | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CasMisses | The number of CAS requests the cache has received where the key requested was not found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CmdConfigGet | The cumulative number of config get requests | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CmdConfigSet | The cumulative number of config set requests | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CmdFlush | The number of flush commands the cache has received | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CmdGet | The number of get commands the cache has received | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CmdSet | The number of set commands the cache has received | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CmdTouch | The cumulative number of touch requests | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CurrConfig | The current number of configurations stored | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
CurrConnections | A count of the number of connections connected to the cache at an instant in time. ElastiCache uses two to three of the connections to monitor the cluster. | Count | Multi | CacheClusterId; CacheClusterId, CacheNodeId | |
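CurrConnections | A count of the number of connections connected to the cache at an instant in time. ElastiCache uses two to three of the connections to monitor the cluster. | Count | Multi | CacheClusterId, CacheNodeId | |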
CurrItems | A count of the number of items currently stored in the cache | Count | Multi | CacheClusterId; CacheClusterId, CacheNodeId | |
DecrHits | The number of decrement requests the cache has received where the requested key was found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
DecrMisses | The number of decrement requests the cache has received where the requested key was not found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
DeleteHits | The number of delete requests the cache has received where the requested key was found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
DeleteMisses | The number of delete requests the cache has received where the requested key was not found. | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
EngineCPUUtilization | Provides CPU utilization of the Redis engine thread | Percent | Multi | CacheClusterId; CacheClusterId, CacheNodeId | |
EvictedUnfetched | The number of valid items evicted from the least recently used cache (LRU) which were never touched after being set | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
Evictions | The number of non-expired items the cache evicted to allow space for new writes | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
Evictions | The number of non-expired items the cache evicted to allow space for new writes | Count | Sum | CacheClusterId, CacheNodeId | |
ExpiredUnfetched | The number of expired items reclaimed from the LRU which were never touched after being set | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
FreeableMemory | The amount of free memory available on the host. This is derived from the RAM, buffers, and cache that the OS reports as freeable. | Bytes | Multi | CacheClusterId; CacheClusterId, CacheNodeId | |
GetHits | The number of get requests the cache has received where the key requested was found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
GetMisses | The number of get requests the cache has received where the key requested was not found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
GetTypeCmds | The total number of read-only type commands. This is derived from the Redis commandstats statistic by summing all of the read-only type commands (get, hget, scard, lrange, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
HashBasedCmds | The total number of commands that are hash-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more hashes (hget, hkeys, hvals, hdel, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
HyperLogLogBasedCmds | The total number of HyperLogLog-based commands | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
IncrHits | The number of increment requests the cache has received where the key requested was found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
IncrMisses | The number of increment requests the cache has received where the key requested was not found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
KeyBasedCmds | The total number of commands that are key-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more keys across multiple data structures (del, expire, rename, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
ListBasedCmds | The total number of commands that are list-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more lists (lindex, lrange, lpush, ltrim, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
NetworkBytesIn | The number of bytes the host has read from the network | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
NetworkBytesOut | The number of bytes sent out on all network interfaces by the instance | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
NewConnections | The number of new connections the cache has received. This is derived from the memcached total_connections statistic by recording the change in total_connections across a period of time. | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
NewItems | The number of new items the cache has stored. This is derived from the memcached total_items statistic by recording the change in total_items across a period of time. | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
Reclaimed | The number of expired items the cache evicted to allow space for new writes | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
ReplicationBytes | For nodes in a replicated configuration, ReplicationBytes reports the number of bytes that the primary is sending to all of its replicas. This metric is representative of the write load on the replication group. This is derived from the master_repl_offset statistic. | Bytes | Multi | CacheClusterId; CacheClusterId, CacheNodeId | |
ReplicationLag | This metric is only applicable for a node running as a read replica. It represents how far behind, in seconds, the replica is in applying changes from the primary node. | Seconds | Multi | CacheClusterId; CacheClusterId, CacheNodeId | |
SaveInProgress | This binary metric returns 1 whenever a background save (forked or forkless) is in progress, and 0 otherwise. | Count | Multi | CacheClusterId; CacheClusterId, CacheNodeId | |
SetBasedCmds | The total number of commands that are set-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more sets (scard, sdiff, sadd, sunion, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
SetTypeCmds | The total number of write types of commands. This is derived from the Redis commandstats statistic by summing all of the mutative types of commands that operate on data (set, hset, sadd, lpop, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
SlabsMoved | The total number of slab pages that have been moved | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
SortedSetBasedCmds | The total number of commands that are sorted set-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more sorted sets (zcount, zrange, zrank, zadd, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
StringBasedCmds | The total number of commands that are string-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more strings (strlen, setex, setrange, and so on). | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
SwapUsage | The amount of swap used on the host | Bytes | Multi | CacheClusterId | |
SwapUsage | The amount of swap used on the host | Bytes | Multi | CacheClusterId, CacheNodeId | |
TouchHits | The number of keys that have been touched and were given a new expiration time | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
TouchMisses | The number of items that have been touched, but were not found | Count | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
UnusedMemory | The amount of memory not used by data. This is derived from the Memcached statistics limit_maxbytes and bytes by subtracting bytes from limit_maxbytes. | Bytes | Sum | CacheClusterId; CacheClusterId, CacheNodeId | |
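
To show how a row of the table maps onto a CloudWatch request, the sketch below builds one entry in the shape that the CloudWatch GetMetricData API expects in its `MetricDataQueries` parameter. It does not call AWS and does not describe how Dynatrace queries CloudWatch internally. The `build_metric_query` helper, the 60-second period, and the cluster and node identifiers are assumptions for illustration; `AWS/ElastiCache` is the CloudWatch namespace for this service.

```python
def build_metric_query(query_id, metric_name, stat, dimensions, period_seconds=60):
    """Build one MetricDataQueries entry for CloudWatch GetMetricData.

    query_id: identifier for the query; CloudWatch requires it to start
        with a lowercase letter.
    dimensions: mapping of dimension name to value, using the names from
        the Dimensions column of the table.
    period_seconds: ASSUMPTION, granularity of the returned data points.
    """
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/ElastiCache",
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": name, "Value": value}
                    for name, value in dimensions.items()
                ],
            },
            "Period": period_seconds,
            "Stat": stat,
        },
        "ReturnData": True,
    }


# Hypothetical cluster and node identifiers.
query = build_metric_query(
    "cache_hits",
    "CacheHits",
    "Sum",
    {"CacheClusterId": "my-cluster", "CacheNodeId": "0001"},
)
print(query)
```

A dictionary like this could be passed, together with a start and end time, to a CloudWatch client such as boto3's `get_metric_data`, which requires the `cloudwatch:GetMetricData` permission listed in the prerequisites.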