Amazon MSK (Kafka)
Dynatrace ingests metrics for multiple preselected namespaces, including Amazon MSK (Kafka). You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.
Prerequisites
To enable monitoring for this service, you need
- An Environment or Cluster ActiveGate version 1.197+
- Dynatrace version 1.203+
- An updated AWS monitoring policy to include the additional AWS services.
To update the AWS IAM policy, use the JSON below, containing the monitoring policy (permissions) for all supporting services.
If you don't want to add permissions for all services and prefer to select permissions for certain services only, consult the table below. The table contains a set of permissions that are required for all services (All monitored Amazon services) and, for each supporting service, a list of optional permissions specific to that service.
Example of JSON policy for one single service.
In this example, from the complete list of permissions you need to select
- "apigateway:GET" for Amazon API Gateway
- "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones" for All monitored Amazon services.
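As a rough illustration of how the permissions in this example fit into an IAM policy document, the following Python sketch assembles them and prints the resulting JSON. The `"Version"`, `"Statement"`, `"Effect"`, `"Action"`, and `"Resource"` keys are the standard IAM policy elements; the statement ID `DynatraceMonitoring`, the helper name `build_policy`, and the wildcard resource are assumptions made for this sketch, not values taken from this page.

```python
import json

# Permissions named in the example above.
SERVICE_ACTIONS = ["apigateway:GET"]  # Amazon API Gateway
COMMON_ACTIONS = [  # All monitored Amazon services
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]


def build_policy(service_actions, common_actions):
    """Return an IAM policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DynatraceMonitoring",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": list(service_actions) + list(common_actions),
                "Resource": "*",  # assumption: no resource restriction
            }
        ],
    }


policy = build_policy(SERVICE_ACTIONS, COMMON_ACTIONS)
print(json.dumps(policy, indent=2))
```

To cover a different service, swap the contents of `SERVICE_ACTIONS` for the permissions listed for that service and keep the common permissions unchanged.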
Enable monitoring
To enable monitoring for this service, you first need to integrate Dynatrace with Amazon Web Services.
Add the service to monitoring
In order to view the service metrics, you must add the service to monitoring in your Dynatrace environment.
Beginning in early 2021, all cloud services will consume Davis Data Units (DDUs). The amount of DDU consumption per service instance depends on the number of monitored metrics and their dimensions (each metric dimension results in the ingestion of 1 data point; 1 data point consumes 0.001 DDUs). For DDU consumption estimates per service instance (recommended metrics only, predefined dimensions, and assumed dimension values), see DDU consumption estimates per cloud service instance.
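The DDU arithmetic described above can be sketched in a few lines of Python. Only the rate of 0.001 DDUs per data point comes from this page. The function name `estimate_ddus`, the reading that each dimension-value combination of a metric yields one data point per collection interval, and the interval count in the usage example are assumptions made for illustration.

```python
DDU_PER_DATA_POINT = 0.001  # rate stated in the text above


def estimate_ddus(dimension_combinations_per_metric, intervals):
    """Estimate DDU consumption for one service instance.

    dimension_combinations_per_metric: for each monitored metric, the number
        of dimension-value combinations it reports (assumed to produce one
        data point each per collection interval).
    intervals: number of collection intervals in the period of interest
        (assumption; this page does not state the collection frequency).
    """
    data_points = sum(dimension_combinations_per_metric) * intervals
    return data_points * DDU_PER_DATA_POINT


# Hypothetical instance: one cluster-level metric and two per-broker metrics
# on a three-broker cluster, observed over 1440 intervals.
estimate = estimate_ddus([1, 3, 3], intervals=1440)
print(f"{estimate:.2f} DDUs")
```

Enabling an optional dimension such as Topic multiplies the number of combinations for the affected metrics, which is why optional dimensions can raise consumption noticeably.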
Monitor resources based on tags
You can choose to monitor resources based on existing AWS tags, as Dynatrace automatically imports them from service instances. Nevertheless, the transition from AWS to Dynatrace tagging isn't supported for all AWS services. Expand the table below to see which supporting services are filtered by tagging.
To monitor resources based on tags
- Go to Settings > Cloud and virtualization > AWS and select the AWS instance.
- For Resource monitoring method, select Monitor resources based on tags.
- Enter the Key and Value.
- Select Save.
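Conceptually, the Key and Value entered in the steps above act as a filter over the tags imported from AWS. The Python sketch below illustrates that idea only. The resource names, the tags, and the exact-match rule are hypothetical and do not describe how Dynatrace evaluates tags internally.

```python
def matches_tag(resource_tags, key, value):
    """Return True if the resource carries the tag key with exactly this value."""
    return resource_tags.get(key) == value


# Hypothetical resources with their AWS tags.
resources = [
    {"name": "msk-prod", "tags": {"env": "prod", "team": "data"}},
    {"name": "msk-staging", "tags": {"env": "staging"}},
    {"name": "msk-untagged", "tags": {}},
]

# Keep only the resources tagged env=prod.
selected = [r["name"] for r in resources if matches_tag(r["tags"], "env", "prod")]
print(selected)
```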
Configure service metrics
Once you add a service, Dynatrace automatically starts collecting a suite of metrics for this particular service. These are recommended metrics.
Recommended metrics:
- Are enabled by default
- Can't be disabled
- Can have recommended dimensions (enabled by default, can't be disabled)
- Can have optional dimensions (disabled by default, can be enabled)
Apart from the recommended metrics, most services also let you enable optional metrics.
Optional metrics:
- Can be added and configured manually
View service metrics
Once you add the service to monitoring, you can view the service metrics in your Dynatrace environment either on your dashboard page or on the custom device overview page.
Import preset dashboards
Dynatrace provides preset AWS dashboards that you can import from GitHub to your environment's Dashboards page.
Note: To save a preset dashboard locally, create a new JSON file on your local machine and copy-paste the content of the JSON file from GitHub into the new file.
Once you save a preset dashboard locally, there are two ways to import it.
Available metrics
Name | Description | Unit | Statistics | Dimensions | Recommended |
---|---|---|---|---|---|
ActiveControllerCount | Only one controller per cluster should be active at any given time. | Count | Multi | Cluster Name | ✔️ |
ActiveControllerCount | | Count | Sum | Cluster Name | ✔️ |
BytesInPerSec | The number of bytes per second received from clients | Bytes/Second | Multi | Cluster Name, Broker ID | |
BytesInPerSec | | Bytes/Second | Multi | Cluster Name, Broker ID, Topic | |
BytesInPerSec | | Bytes/Second | Sum | Cluster Name, Broker ID | |
BytesInPerSec | | Bytes/Second | Sum | Cluster Name, Broker ID, Topic | |
BytesOutPerSec | The number of bytes per second sent to clients | Bytes/Second | Multi | Cluster Name, Broker ID | |
BytesOutPerSec | | Bytes/Second | Multi | Cluster Name, Broker ID, Topic | |
BytesOutPerSec | | Bytes/Second | Sum | Cluster Name, Broker ID | |
BytesOutPerSec | | Bytes/Second | Sum | Cluster Name, Broker ID, Topic | |
CPUCreditBalance | The number of earned credits | Count | Multi | Cluster Name, Broker ID | |
CPUCreditBalance | | Count | Sum | Cluster Name, Broker ID | |
CPUCreditUsage | The number of used credits | Count | Multi | Cluster Name, Broker ID | |
CPUCreditUsage | | Count | Sum | Cluster Name, Broker ID | |
CpuIdle | The percentage of CPU idle time | Percent | Multi | Cluster Name, Broker ID | ✔️ |
CpuIdle | | Percent | Sum | Cluster Name, Broker ID | ✔️ |
CpuSystem | The percentage of CPU in kernel space | Percent | Multi | Cluster Name, Broker ID | ✔️ |
CpuSystem | | Percent | Sum | Cluster Name, Broker ID | ✔️ |
CpuUser | The percentage of CPU in user space | Percent | Multi | Cluster Name, Broker ID | ✔️ |
CpuUser | | Percent | Sum | Cluster Name, Broker ID | ✔️ |
FetchConsumerLocalTimeMsMean | The mean time in milliseconds that the consumer request is processed at the leader | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchConsumerLocalTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchConsumerRequestQueueTimeMsMean | The mean time in milliseconds that the consumer request waits in the request queue | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchConsumerRequestQueueTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchConsumerResponseQueueTimeMsMean | The mean time in milliseconds that the consumer request waits in the response queue | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchConsumerResponseQueueTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchConsumerResponseSendTimeMsMean | | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchConsumerResponseSendTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchConsumerTotalTimeMsMean | The mean total time in milliseconds that consumers spend on fetching data from the broker | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchConsumerTotalTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchFollowerLocalTimeMsMean | The mean time in milliseconds that the follower request is processed at the leader | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchFollowerLocalTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchFollowerRequestQueueTimeMsMean | The mean time in milliseconds that the follower request waits in the request queue | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchFollowerRequestQueueTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchFollowerResponseQueueTimeMsMean | The mean time in milliseconds that the follower request waits in the response queue | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchFollowerResponseQueueTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchFollowerResponseSendTimeMsMean | The mean time in milliseconds for the follower to send a response | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchFollowerResponseSendTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchFollowerTotalTimeMsMean | The mean total time in milliseconds that followers spend on fetching data from the broker | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchFollowerTotalTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchMessageConversionsPerSec | The number of fetch message conversions per second for the broker | Count/Second | Multi | Cluster Name, Broker ID | |
FetchMessageConversionsPerSec | | Count/Second | Multi | Cluster Name, Broker ID, Topic | |
FetchMessageConversionsPerSec | | Count/Second | Sum | Cluster Name, Broker ID | |
FetchMessageConversionsPerSec | | Count/Second | Sum | Cluster Name, Broker ID, Topic | |
FetchMessageConversionsTimeMsMean | The mean total time in milliseconds that messages being fetched spend converting | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchMessageConversionsTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
FetchThrottleByteRate | The number of throttled bytes per second | Bytes/Second | Multi | Cluster Name, Broker ID | |
FetchThrottleByteRate | | Bytes/Second | Sum | Cluster Name, Broker ID | |
FetchThrottleQueueSize | The number of messages in the throttle queue | Count | Multi | Cluster Name, Broker ID | |
FetchThrottleQueueSize | | Count | Sum | Cluster Name, Broker ID | |
FetchThrottleTime | The average fetch throttle time in milliseconds | Milliseconds | Multi | Cluster Name, Broker ID | |
FetchThrottleTime | | Milliseconds | Sum | Cluster Name, Broker ID | |
GlobalPartitionCount | Total number of partitions across all brokers in the cluster | Count | Multi | Cluster Name | ✔️ |
GlobalPartitionCount | | Count | Sum | Cluster Name | ✔️ |
GlobalTopicCount | Total number of topics across all brokers in the cluster | Count | Multi | Cluster Name | ✔️ |
GlobalTopicCount | | Count | Sum | Cluster Name | ✔️ |
KafkaAppLogsDiskUsed | The percentage of disk space used for application logs | Percent | Multi | Cluster Name, Broker ID | ✔️ |
KafkaAppLogsDiskUsed | | Percent | Sum | Cluster Name, Broker ID | ✔️ |
KafkaDataLogsDiskUsed | The percentage of disk space used for data logs | Percent | Multi | Cluster Name, Broker ID | ✔️ |
KafkaDataLogsDiskUsed | | Percent | Sum | Cluster Name, Broker ID | ✔️ |
LeaderCount | The number of leader replicas | Count | Multi | Cluster Name, Broker ID | |
LeaderCount | | Count | Sum | Cluster Name, Broker ID | |
MemoryBuffered | The size in bytes of buffered memory for the broker | Bytes | Multi | Cluster Name, Broker ID | ✔️ |
MemoryBuffered | | Bytes | Sum | Cluster Name, Broker ID | ✔️ |
MemoryCached | The size in bytes of cached memory for the broker | Bytes | Multi | Cluster Name, Broker ID | ✔️ |
MemoryCached | | Bytes | Sum | Cluster Name, Broker ID | ✔️ |
MemoryFree | The size in bytes of memory that is free and available for the broker | Bytes | Multi | Cluster Name, Broker ID | ✔️ |
MemoryFree | | Bytes | Sum | Cluster Name, Broker ID | ✔️ |
MemoryUsed | The size in bytes of memory that is in use for the broker | Bytes | Multi | Cluster Name, Broker ID | ✔️ |
MemoryUsed | | Bytes | Sum | Cluster Name, Broker ID | ✔️ |
MessagesInPerSec | The number of incoming messages per second for the broker | Count/Second | Multi | Cluster Name, Broker ID | |
MessagesInPerSec | | Count/Second | Multi | Cluster Name, Broker ID, Topic | |
MessagesInPerSec | | Count/Second | Sum | Cluster Name, Broker ID | |
MessagesInPerSec | | Count/Second | Sum | Cluster Name, Broker ID, Topic | |
NetworkProcessorAvgIdlePercent | The average percentage of the time the network processors are idle | Percent | Multi | Cluster Name, Broker ID | |
NetworkProcessorAvgIdlePercent | | Percent | Sum | Cluster Name, Broker ID | |
NetworkRxDropped | The number of dropped receive packages | Count | Multi | Cluster Name, Broker ID | ✔️ |
NetworkRxDropped | | Count | Sum | Cluster Name, Broker ID | ✔️ |
NetworkRxErrors | The number of network receive errors for the broker | Count | Multi | Cluster Name, Broker ID | ✔️ |
NetworkRxErrors | | Count | Sum | Cluster Name, Broker ID | ✔️ |
NetworkRxPackets | The number of packets received by the broker | Count | Multi | Cluster Name, Broker ID | ✔️ |
NetworkRxPackets | | Count | Sum | Cluster Name, Broker ID | ✔️ |
NetworkTxDropped | The number of dropped transmit packages | Count | Multi | Cluster Name, Broker ID | ✔️ |
NetworkTxDropped | | Count | Sum | Cluster Name, Broker ID | ✔️ |
NetworkTxErrors | The number of network transmit errors for the broker | Count | Multi | Cluster Name, Broker ID | ✔️ |
NetworkTxErrors | | Count | Sum | Cluster Name, Broker ID | ✔️ |
NetworkTxPackets | The number of packets transmitted by the broker | Count | Multi | Cluster Name, Broker ID | ✔️ |
NetworkTxPackets | | Count | Sum | Cluster Name, Broker ID | ✔️ |
OfflinePartitionsCount | Total number of partitions that are offline in the cluster | Count | Multi | Cluster Name | ✔️ |
OfflinePartitionsCount | | Count | Sum | Cluster Name | ✔️ |
PartitionCount | The number of partitions for the broker | Count | Multi | Cluster Name, Broker ID | |
PartitionCount | | Count | Sum | Cluster Name, Broker ID | |
ProduceLocalTimeMsMean | The mean time in milliseconds that the request is processed at the leader | Milliseconds | Multi | Cluster Name, Broker ID | |
ProduceLocalTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
ProduceMessageConversionsPerSec | The number of produce message conversions per second for the broker | Count/Second | Multi | Cluster Name, Broker ID | |
ProduceMessageConversionsPerSec | | Count/Second | Multi | Cluster Name, Broker ID, Topic | |
ProduceMessageConversionsPerSec | | Count/Second | Sum | Cluster Name, Broker ID | |
ProduceMessageConversionsPerSec | | Count/Second | Sum | Cluster Name, Broker ID, Topic | |
ProduceMessageConversionsTimeMsMean | The mean time in milliseconds spent on message format conversions | Milliseconds | Multi | Cluster Name, Broker ID | |
ProduceMessageConversionsTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
ProduceRequestQueueTimeMsMean | The mean time in milliseconds that request messages spend in the queue | Milliseconds | Multi | Cluster Name, Broker ID | |
ProduceRequestQueueTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
ProduceResponseQueueTimeMsMean | The mean time in milliseconds that response messages spend in the queue | Milliseconds | Multi | Cluster Name, Broker ID | |
ProduceResponseQueueTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
ProduceResponseSendTimeMsMean | The mean time in milliseconds spent on sending response messages | Milliseconds | Multi | Cluster Name, Broker ID | |
ProduceResponseSendTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
ProduceThrottleByteRate | The number of throttled bytes per second | Bytes/Second | Multi | Cluster Name, Broker ID | |
ProduceThrottleByteRate | | Bytes/Second | Sum | Cluster Name, Broker ID | |
ProduceThrottleQueueSize | The number of messages in the throttle queue | Count | Multi | Cluster Name, Broker ID | |
ProduceThrottleQueueSize | | Count | Sum | Cluster Name, Broker ID | |
ProduceThrottleTime | The average produce throttle time in milliseconds | Milliseconds | Multi | Cluster Name, Broker ID | |
ProduceThrottleTime | | Milliseconds | Sum | Cluster Name, Broker ID | |
ProduceTotalTimeMsMean | The mean produce time in milliseconds | Milliseconds | Multi | Cluster Name, Broker ID | |
ProduceTotalTimeMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | |
RequestBytesMean | The mean number of request bytes for the broker | Bytes | Multi | Cluster Name, Broker ID | |
RequestBytesMean | | Bytes | Sum | Cluster Name, Broker ID | |
RequestExemptFromThrottleTime | The average time in milliseconds spent in broker network and I/O threads to process requests that are exempt from throttling | Milliseconds | Multi | Cluster Name, Broker ID | |
RequestExemptFromThrottleTime | | Milliseconds | Sum | Cluster Name, Broker ID | |
RequestHandlerAvgIdlePercent | The average percentage of the time the request handler threads are idle | Percent | Multi | Cluster Name, Broker ID | |
RequestHandlerAvgIdlePercent | | Percent | Sum | Cluster Name, Broker ID | |
RequestThrottleQueueSize | The number of messages in the throttle queue | Count | Multi | Cluster Name, Broker ID | |
RequestThrottleQueueSize | | Count | Sum | Cluster Name, Broker ID | |
RequestThrottleTime | The average request throttle time in milliseconds | Milliseconds | Multi | Cluster Name, Broker ID | |
RequestThrottleTime | | Milliseconds | Sum | Cluster Name, Broker ID | |
RequestTime | The average time in milliseconds spent in broker network and I/O threads to process requests | Milliseconds | Multi | Cluster Name, Broker ID | |
RequestTime | | Milliseconds | Sum | Cluster Name, Broker ID | |
RootDiskUsed | The percentage of the root disk used by the broker | Percent | Multi | Cluster Name, Broker ID | ✔️ |
RootDiskUsed | | Percent | Sum | Cluster Name, Broker ID | ✔️ |
SwapFree | The size in bytes of swap memory that is available for the broker | Bytes | Multi | Cluster Name, Broker ID | ✔️ |
SwapFree | | Bytes | Sum | Cluster Name, Broker ID | ✔️ |
SwapUsed | The size in bytes of swap memory that is in use for the broker | Bytes | Multi | Cluster Name, Broker ID | ✔️ |
SwapUsed | | Bytes | Sum | Cluster Name, Broker ID | ✔️ |
UnderMinIsrPartitionCount | The number of under minIsr partitions for the broker | Count | Multi | Cluster Name, Broker ID | |
UnderMinIsrPartitionCount | | Count | Sum | Cluster Name, Broker ID | |
UnderReplicatedPartitions | The number of under-replicated partitions for the broker | Count | Multi | Cluster Name, Broker ID | |
UnderReplicatedPartitions | | Count | Sum | Cluster Name, Broker ID | |
ZooKeeperRequestLatencyMsMean | Mean latency in milliseconds for ZooKeeper requests from broker | Milliseconds | Multi | Cluster Name, Broker ID | ✔️ |
ZooKeeperRequestLatencyMsMean | | Milliseconds | Multi | Cluster Name | ✔️ |
ZooKeeperRequestLatencyMsMean | | Milliseconds | Sum | Cluster Name, Broker ID | ✔️ |
ZooKeeperRequestLatencyMsMean | | Milliseconds | Sum | Cluster Name | ✔️ |
ZooKeeperSessionState | Connection status of broker's ZooKeeper session, which may be one of the following: NOT_CONNECTED: 0.0, ASSOCIATING: 0.1, CONNECTING: 0.5, CONNECTEDREADONLY: 0.8, CONNECTED: 1.0, CLOSED: 5.0, AUTH_FAILED: 10.0. | Count | Multi | Cluster Name, Broker ID | ✔️ |
ZooKeeperSessionState | | Count | Multi | Cluster Name | ✔️ |
ZooKeeperSessionState | | Count | Sum | Cluster Name, Broker ID | |
ZooKeeperSessionState | | Count | Sum | Cluster Name | |
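
Because ZooKeeperSessionState is reported as a number, it can help to translate a reading back into the state name listed in the table. The Python sketch below uses only the value-to-state pairs from the metric description; the function name `session_state_name`, the tolerance, and the `"UNKNOWN"` fallback are assumptions made for this example.

```python
# Value-to-state pairs as listed in the ZooKeeperSessionState description.
ZOOKEEPER_SESSION_STATES = {
    0.0: "NOT_CONNECTED",
    0.1: "ASSOCIATING",
    0.5: "CONNECTING",
    0.8: "CONNECTEDREADONLY",
    1.0: "CONNECTED",
    5.0: "CLOSED",
    10.0: "AUTH_FAILED",
}


def session_state_name(value, tolerance=1e-6):
    """Map a raw metric value to its state name, or "UNKNOWN" if none matches."""
    for code, name in ZOOKEEPER_SESSION_STATES.items():
        if abs(value - code) <= tolerance:
            return name
    return "UNKNOWN"


for reading in (1.0, 0.5, 10.0, 0.3):
    print(reading, session_state_name(reading))
```

Values aggregated across several brokers or time slots (for example, a Sum or an average) may not correspond to any single state, which is why the sketch falls back to `"UNKNOWN"` instead of picking the nearest one.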