AWS Step Functions
Dynatrace ingests metrics for multiple preselected namespaces, including AWS Step Functions. You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.
Prerequisites
To enable monitoring for this service, you need:
- An Environment or Cluster ActiveGate version 1.197+
- Dynatrace version 1.200+
- An updated AWS monitoring policy that includes the additional AWS services.
To update the AWS IAM policy, use the JSON below, which contains the monitoring policy (permissions) for all supported services.
If you don't want to add permissions for all services and prefer to select permissions for certain services only, consult the table below. The table lists the permissions that are required for all services (All monitored Amazon services) and, for each supported service, the optional permissions specific to that service.
Example of a JSON policy for a single service.
In this example, from the complete list of permissions you need to select:
- "apigateway:GET" for Amazon API Gateway
- "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones" for All monitored Amazon services
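As an illustration of the selection described above, the following minimal sketch assembles a single-service policy document in the standard IAM policy format. The helper name `build_monitoring_policy` is hypothetical, and the `"Resource": "*"` scope is an assumption rather than something stated in this page; adjust it to your own security requirements.

```python
import json

# Permissions the text lists for "All monitored Amazon services".
COMMON_ACTIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]


def build_monitoring_policy(service_actions):
    """Return an IAM policy document (as a dict) that allows the
    service-specific actions followed by the common actions.

    Hypothetical helper for illustration only.
    """
    return {
        "Version": "2012-10-17",  # current IAM policy language version
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(service_actions) + COMMON_ACTIONS,
                "Resource": "*",  # assumption: not specified in the text
            }
        ],
    }


# Single-service example from the text: Amazon API Gateway.
policy = build_monitoring_policy(["apigateway:GET"])
print(json.dumps(policy, indent=2))
```

The printed JSON can be used as a starting point for the policy document; for another service, the optional permissions for that service would replace `"apigateway:GET"`.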
Enable monitoring
To enable monitoring for this service, you first need to integrate Dynatrace with Amazon Web Services.
Add the service to monitoring
To view the service metrics, you must add the service to monitoring in your Dynatrace environment.
Beginning in early 2021, all cloud services will consume Davis Data Units (DDUs). The amount of DDU consumption per service instance depends on the number of monitored metrics and their dimensions (each metric dimension results in the ingestion of 1 data point; 1 data point consumes 0.001 DDUs). For DDU consumption estimates per service instance (recommended metrics only, predefined dimensions, and assumed dimension values), see DDU consumption estimates per cloud service instance.
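The arithmetic behind the DDU rate can be sketched as follows. Only the rate of 0.001 DDUs per data point comes from the text above; the function name `estimate_ddus`, the count of 10 metric-dimension combinations, and the one-data-point-per-minute interval are assumptions chosen for illustration.

```python
# From the text: 1 data point consumes 0.001 DDUs, i.e. 1000 data points per DDU.
DATA_POINTS_PER_DDU = 1000


def estimate_ddus(metric_dimension_count, data_points_per_series):
    """Estimate DDU consumption.

    metric_dimension_count: number of metric-dimension combinations monitored
    data_points_per_series: data points ingested per combination in the period
    """
    total_data_points = metric_dimension_count * data_points_per_series
    return total_data_points / DATA_POINTS_PER_DDU


# Assumed scenario: 10 metric-dimension combinations, each producing
# one data point per minute over one day.
MINUTES_PER_DAY = 24 * 60
daily_ddus = estimate_ddus(10, MINUTES_PER_DAY)
print(daily_ddus)
```

Under these assumptions, one day yields 14,400 data points, which corresponds to 14.4 DDUs. Actual consumption depends on which metrics and dimensions are enabled for the service instance.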
Monitor resources based on tags
You can choose to monitor resources based on existing AWS tags, as Dynatrace automatically imports them from service instances. Nevertheless, the transition from AWS to Dynatrace tagging isn't supported for all AWS services. See the table below for the supported services that can be filtered by tagging.
To monitor resources based on tags:
- Go to Settings > Cloud and virtualization > AWS and select the AWS instance.
- For Resource monitoring method, select Monitor resources based on tags.
- Enter the Key and Value.
- Select Save.
Configure service metrics
Once you add a service, Dynatrace automatically starts collecting a set of metrics for that service. These are the recommended metrics.
Recommended metrics:
- Are enabled by default
- Can't be disabled
- Can have recommended dimensions (enabled by default, can't be disabled)
- Can have optional dimensions (disabled by default, can be enabled)
Apart from the recommended metrics, most services also let you enable optional metrics.
Optional metrics:
- Can be added and configured manually
View service metrics
Once you add the service to monitoring, you can view the service metrics in your Dynatrace environment either on your dashboard page or on the custom device overview page.
Import preset dashboards
Dynatrace provides preset AWS dashboards that you can import from GitHub to your environment's Dashboards page.
Note: To save a preset dashboard locally, create a new JSON file on your local machine and copy-paste the content of the JSON file from GitHub into the new file.
Once you save a preset dashboard locally, there are two ways to import it.
Available metrics
Name | Description | Unit | Statistics | Dimensions | Recommended |
---|---|---|---|---|---|
ActivitiesFailed | The number of failed activities | Count | Sum | Region, ActivityArn | ✔️ |
ActivitiesHeartbeatTimedOut | The number of activities that time out due to a heartbeat timeout | Count | Sum | Region, ActivityArn | ✔️ |
ActivitiesScheduled | The number of scheduled activities | Count | Sum | Region, ActivityArn | ✔️ |
ActivitiesStarted | The number of started activities | Count | Sum | Region, ActivityArn | |
ActivitiesSucceeded | The number of successfully completed activities | Count | Sum | Region, ActivityArn | ✔️ |
ActivitiesTimedOut | The number of activities that time out on close | Count | Sum | Region, ActivityArn | ✔️ |
ActivityRunTime | The interval, in milliseconds, between the time the activity starts and the time it closes | Milliseconds | Multi | Region, ActivityArn | ✔️ |
ActivityScheduleTime | The interval, in milliseconds, for which the activity stays in the schedule state | Milliseconds | Multi | Region, ActivityArn | |
ActivityTime | The interval, in milliseconds, between the time the activity is scheduled and the time it closes | Milliseconds | Multi | Region, ActivityArn | |
ConsumedCapacity | The count of requests per second | Count | Sum | Region, ServiceMetric | ✔️ |
ConsumedCapacity | | Count | Sum | Region, APIName | ✔️ |
ExecutionThrottled | The number of StateEntered events and retries that have been throttled | Count | Sum | Region, StateMachineArn | ✔️ |
ExecutionTime | The interval, in milliseconds, between the time the execution starts and the time it closes | Milliseconds | Multi | Region, StateMachineArn | ✔️ |
ExecutionsAborted | The number of aborted or terminated executions | Count | Sum | Region, StateMachineArn | ✔️ |
ExecutionsFailed | The number of failed executions | Count | Sum | Region, StateMachineArn | ✔️ |
ExecutionsStarted | The number of started executions | Count | Sum | Region, StateMachineArn | ✔️ |
ExecutionsSucceeded | The number of successfully completed executions | Count | Sum | Region, StateMachineArn | ✔️ |
ExecutionsTimedOut | The number of executions that time out for any reason | Count | Sum | Region, StateMachineArn | ✔️ |
LambdaFunctionRunTime | The interval, in milliseconds, between the time the Lambda function starts and the time it closes | Milliseconds | Multi | Region, LambdaFunctionArn | ✔️ |
LambdaFunctionScheduleTime | The interval, in milliseconds, for which the Lambda function stays in the schedule state | Milliseconds | Multi | Region, LambdaFunctionArn | |
LambdaFunctionTime | The interval, in milliseconds, between the time the Lambda function is scheduled and the time it closes | Milliseconds | Multi | Region, LambdaFunctionArn | |
LambdaFunctionsFailed | The number of failed Lambda functions | Count | Sum | Region, LambdaFunctionArn | ✔️ |
LambdaFunctionsScheduled | The number of scheduled Lambda functions | Count | Sum | Region, LambdaFunctionArn | ✔️ |
LambdaFunctionsStarted | The number of started Lambda functions | Count | Sum | Region, LambdaFunctionArn | |
LambdaFunctionsSucceeded | The number of successfully completed Lambda functions | Count | Sum | Region, LambdaFunctionArn | ✔️ |
LambdaFunctionsTimedOut | The number of Lambda functions that time out on close | Count | Sum | Region, LambdaFunctionArn | ✔️ |
ProvisionedBucketSize | The count of available requests per second | Count | Multi | Region, ServiceMetric | |
ProvisionedBucketSize | | Count | Multi | Region, APIName | |
ProvisionedRefillRate | The count of requests per second that are allowed into the bucket | Count | Multi | Region, ServiceMetric | |
ProvisionedRefillRate | | Count | Multi | Region, APIName | |
ServiceIntegrationRunTime | The interval, in milliseconds, between the time the service task starts and the time it closes | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | ✔️ |
ServiceIntegrationScheduleTime | The interval, in milliseconds, for which the service task stays in the schedule state | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | |
ServiceIntegrationTime | The interval, in milliseconds, between the time the service task is scheduled and the time it closes | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | |
ServiceIntegrationsFailed | The number of failed service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
ServiceIntegrationsScheduled | The number of scheduled service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
ServiceIntegrationsStarted | The number of started service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | |
ServiceIntegrationsSucceeded | The number of successfully completed service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
ServiceIntegrationsTimedOut | The number of service tasks that time out on close | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
ThrottledEvents | The count of requests that have been throttled | Count | Sum | Region, ServiceMetric | ✔️ |
ThrottledEvents | | Count | Sum | Region, APIName | ✔️ |
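To cross-check a metric from the table directly in Amazon CloudWatch, a query entry for the CloudWatch GetMetricData API can be built as in the sketch below. It assumes the metrics are published in the `AWS/States` CloudWatch namespace; the helper name `build_metric_query`, the query ID, the 60-second period, and the state machine ARN are hypothetical values for illustration. Region is not passed as a dimension here because it is selected on the API client rather than in the query.

```python
def build_metric_query(metric_name, dimension_name, dimension_value,
                       stat="Sum", period_seconds=60, query_id="m1"):
    """Build one MetricDataQueries entry for CloudWatch GetMetricData.

    Hypothetical helper; query_id must start with a lowercase letter.
    """
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/States",  # assumed Step Functions namespace
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": dimension_name, "Value": dimension_value}
                ],
            },
            "Period": period_seconds,  # assumed granularity, in seconds
            "Stat": stat,
        },
        "ReturnData": True,
    }


# Hypothetical ARN, used only to show the shape of the query.
EXAMPLE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:ExampleStateMachine"

query = build_metric_query("ExecutionsFailed", "StateMachineArn", EXAMPLE_ARN)
print(query["MetricStat"]["Metric"]["MetricName"])
```

A list containing this entry could then be passed as the `MetricDataQueries` argument of a GetMetricData call, together with a start and end time, using credentials that hold the "cloudwatch:GetMetricData" permission.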
Limitations
Dynatrace gathers metrics for AWS Step Functions at the custom device group level instead of the custom device level (metrics are service-wide).