AWS Step Functions monitoring

Dynatrace ingests metrics for multiple preselected namespaces, including AWS Step Functions. You can view metrics for each service instance, split metrics into multiple dimensions, and create custom charts that you can pin to your dashboards.

Prerequisites

To enable monitoring for this service, you need:

  • An Environment or Cluster ActiveGate version 1.197+
    Note: For role-based access (whether in a SaaS or Managed deployment), you need an Environment ActiveGate installed on an AWS EC2 host.
  • Dynatrace version 1.200+
  • An updated AWS monitoring policy that includes the additional AWS services

To update the AWS IAM policy, use the JSON below, which contains the monitoring policy (permissions) for all supporting services.

If you don't want to add permissions for all services and prefer to select permissions only for certain services, consult the table below. The table contains the set of permissions required for all services (All monitored Amazon services) and, for each supporting service, a list of optional permissions specific to that service.

Example of a JSON policy for a single service

In this example, from the complete list of permissions, you need to select:

  • "apigateway:GET" for Amazon API Gateway
  • "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics", "sts:GetCallerIdentity", "tag:GetResources", "tag:GetTagKeys", and "ec2:DescribeAvailabilityZones" for All monitored Amazon services.
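
As a rough illustration, the selection described above could be assembled into a policy document like the one built below. The statement layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`) follows the standard AWS IAM policy grammar. The `Sid` value, the `"*"` resource scope, and the helper name `build_policy` are assumptions made for this sketch and are not taken from the Dynatrace-provided policy.

```python
import json

# Permissions required for all monitored Amazon services (from the list above).
COMMON_ACTIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "sts:GetCallerIdentity",
    "tag:GetResources",
    "tag:GetTagKeys",
    "ec2:DescribeAvailabilityZones",
]

# Optional, service-specific permissions (here: Amazon API Gateway).
API_GATEWAY_ACTIONS = ["apigateway:GET"]


def build_policy(service_actions):
    """Return an IAM policy dict combining service-specific and common actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DynatraceMonitoringSketch",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": service_actions + COMMON_ACTIONS,
                "Resource": "*",  # assumption: no resource-level restriction
            }
        ],
    }


policy = build_policy(API_GATEWAY_ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the complete policy to confirm that only the intended permissions remain.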

Enable monitoring

To enable monitoring for this service, you first need to integrate Dynatrace with Amazon Web Services.

Add the service to monitoring

To view the service metrics, you must add the service to monitoring in your Dynatrace environment.

Cloud-service monitoring consumption

As of 2021, all cloud services consume Davis data units (DDUs). The amount of DDU consumption per service instance depends on the number of monitored metrics and their dimensions (each metric dimension results in the ingestion of 1 data point; 1 data point consumes 0.001 DDUs).
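
The arithmetic above can be sketched as follows. The rate of 0.001 DDUs per data point comes from the text; the example workload (20 metric dimensions, each reported once per minute for a day) and the function name `estimate_ddus` are hypothetical and chosen only to show the calculation.

```python
from decimal import Decimal

# Rate stated above: 1 data point consumes 0.001 DDUs.
DDU_PER_DATA_POINT = Decimal("0.001")


def estimate_ddus(data_points: int) -> Decimal:
    """Estimate DDU consumption for a given number of ingested data points."""
    return DDU_PER_DATA_POINT * data_points


# Hypothetical workload: 20 metric dimensions, one data point each per minute,
# over 24 hours. The reporting interval is an assumption, not a documented value.
points_per_day = 20 * 60 * 24
print(estimate_ddus(points_per_day))
```

`Decimal` is used instead of floating point so that the small per-point rate doesn't accumulate rounding error over large data point counts.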

Monitor resources based on tags

You can choose to monitor resources based on existing AWS tags, as Dynatrace automatically imports them from service instances. Nevertheless, the transition from AWS to Dynatrace tagging isn't supported for all AWS services. Expand the table below to see which supporting services are filtered by tagging.

To monitor resources based on tags:

  1. In the Dynatrace menu, go to Settings > Cloud and virtualization > AWS and select Edit for the desired AWS instance.
  2. For Resources to be monitored, select Monitor resources selected by tags.
  3. Enter the Key and Value.
  4. Select Save.
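
Conceptually, selecting resources by a Key and Value pair amounts to a filter like the one sketched below. This is only an illustration of the idea: it doesn't call any Dynatrace or AWS API, the resource names and tags are made up, and the exact matching rules Dynatrace applies aren't described in this page.

```python
def matches_tag(resource_tags: dict, key: str, value: str) -> bool:
    """Return True if the resource carries the given tag key with the given value."""
    return resource_tags.get(key) == value


# Hypothetical resources with their AWS tags.
resources = {
    "state-machine-a": {"env": "prod", "team": "payments"},
    "state-machine-b": {"env": "dev"},
}

# Keep only the resources tagged env=prod.
monitored = [
    name for name, tags in resources.items() if matches_tag(tags, "env", "prod")
]
print(monitored)
```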

Configure service metrics

Once you add a service, Dynatrace automatically starts collecting a suite of metrics for that service. These are the recommended metrics.

Recommended metrics:

  • Are enabled by default
  • Can't be disabled
  • Can have recommended dimensions (enabled by default, can't be disabled)
  • Can have optional dimensions (disabled by default, can be enabled)

Apart from the recommended metrics, most services also let you enable optional metrics.

Optional metrics:

  • Can be added and configured manually

View service metrics

You can view the service metrics in your Dynatrace environment either on the custom device overview page or on your Dashboards page.

View metrics on the custom device overview page

To access the custom device overview page:

  1. In the Dynatrace menu, go to Technologies and processes.
  2. Filter by service name and select the relevant custom device group.
  3. Once you select the custom device group, you're on the custom device group overview page.
  4. The custom device group overview page lists all instances (custom devices) belonging to the group. Select an instance to view the custom device overview page.

View metrics on your dashboard

After you add the service to monitoring, a preset dashboard containing all recommended metrics is automatically listed on your Dashboards page. To look for specific dashboards, filter by Preset and then by Name.
Note: For existing monitored services, you might need to resave your credentials for the preset dashboard to appear on the Dashboards page. To resave your credentials, go to Settings > Cloud and virtualization > AWS, select the desired AWS instance, and then select Save.

You can't make changes on a preset dashboard directly, but you can clone and edit it. To clone a dashboard, open the browse menu and select Clone.
To remove a dashboard from the Dashboards page, you can hide it. To hide a dashboard, open the browse menu and select Hide.
Note: Hiding a dashboard doesn't affect other users.

To check the availability of preset dashboards for each AWS service, see the list below.


Available metrics

| Name | Description | Unit | Statistics | Dimensions | Recommended |
|---|---|---|---|---|---|
| ActivitiesFailed | The number of failed activities | Count | Sum | Region, ActivityArn | ✔️ |
| ActivitiesHeartbeatTimedOut | The number of activities that time out due to a heartbeat timeout | Count | Sum | Region, ActivityArn | ✔️ |
| ActivitiesScheduled | The number of scheduled activities | Count | Sum | Region, ActivityArn | ✔️ |
| ActivitiesStarted | The number of started activities | Count | Sum | Region, ActivityArn | |
| ActivitiesSucceeded | The number of successfully completed activities | Count | Sum | Region, ActivityArn | ✔️ |
| ActivitiesTimedOut | The number of activities that time out on close | Count | Sum | Region, ActivityArn | ✔️ |
| ActivityRunTime | The interval, in milliseconds, between the time the activity starts and the time it closes | Milliseconds | Multi | Region, ActivityArn | ✔️ |
| ActivityScheduleTime | The interval, in milliseconds, for which the activity stays in the schedule state | Milliseconds | Multi | Region, ActivityArn | |
| ActivityTime | The interval, in milliseconds, between the time the activity is scheduled and the time it closes | Milliseconds | Multi | Region, ActivityArn | |
| ConsumedCapacity | The count of requests per second | Count | Sum | Region, ServiceMetric | ✔️ |
| ConsumedCapacity | | Count | Sum | Region, APIName | ✔️ |
| ExecutionThrottled | The number of StateEntered events and retries that have been throttled | Count | Sum | Region, StateMachineArn | ✔️ |
| ExecutionTime | The interval, in milliseconds, between the time the execution starts and the time it closes | Milliseconds | Multi | Region, StateMachineArn | ✔️ |
| ExecutionsAborted | The number of aborted or terminated executions | Count | Sum | Region, StateMachineArn | ✔️ |
| ExecutionsFailed | The number of failed executions | Count | Sum | Region, StateMachineArn | ✔️ |
| ExecutionsStarted | The number of started executions | Count | Sum | Region, StateMachineArn | ✔️ |
| ExecutionsSucceeded | The number of successfully completed executions | Count | Sum | Region, StateMachineArn | ✔️ |
| ExecutionsTimedOut | The number of executions that time out for any reason | Count | Sum | Region, StateMachineArn | ✔️ |
| LambdaFunctionRunTime | The interval, in milliseconds, between the time the Lambda function starts and the time it closes | Milliseconds | Multi | Region, LambdaFunctionArn | ✔️ |
| LambdaFunctionScheduleTime | The interval, in milliseconds, for which the Lambda function stays in the schedule state | Milliseconds | Multi | Region, LambdaFunctionArn | |
| LambdaFunctionTime | The interval, in milliseconds, between the time the Lambda function is scheduled and the time it closes | Milliseconds | Multi | Region, LambdaFunctionArn | |
| LambdaFunctionsFailed | The number of failed Lambda functions | Count | Sum | Region, LambdaFunctionArn | ✔️ |
| LambdaFunctionsScheduled | The number of scheduled Lambda functions | Count | Sum | Region, LambdaFunctionArn | ✔️ |
| LambdaFunctionsStarted | The number of started Lambda functions | Count | Sum | Region, LambdaFunctionArn | |
| LambdaFunctionsSucceeded | The number of successfully completed Lambda functions | Count | Sum | Region, LambdaFunctionArn | ✔️ |
| LambdaFunctionsTimedOut | The number of Lambda functions that time out on close | Count | Sum | Region, LambdaFunctionArn | ✔️ |
| ProvisionedBucketSize | The count of available requests per second | Count | Multi | Region, ServiceMetric | |
| ProvisionedBucketSize | | Count | Multi | Region, APIName | |
| ProvisionedRefillRate | The count of requests per second that are allowed into the bucket | Count | Multi | Region, ServiceMetric | |
| ProvisionedRefillRate | | Count | Multi | Region, APIName | |
| ServiceIntegrationRunTime | The interval, in milliseconds, between the time the service task starts and the time it closes | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | ✔️ |
| ServiceIntegrationScheduleTime | The interval, in milliseconds, for which the service task stays in the schedule state | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | |
| ServiceIntegrationTime | The interval, in milliseconds, between the time the service task is scheduled and the time it closes | Milliseconds | Multi | Region, ServiceIntegrationResourceArn | |
| ServiceIntegrationsFailed | The number of failed service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
| ServiceIntegrationsScheduled | The number of scheduled service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
| ServiceIntegrationsStarted | The number of started service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | |
| ServiceIntegrationsSucceeded | The number of successfully completed service tasks | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
| ServiceIntegrationsTimedOut | The number of service tasks that time out on close | Count | Sum | Region, ServiceIntegrationResourceArn | ✔️ |
| ThrottledEvents | The count of requests that have been throttled | Count | Sum | Region, ServiceMetric | ✔️ |
| ThrottledEvents | | Count | Sum | Region, APIName | ✔️ |
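
The CloudWatch permissions listed under Prerequisites suggest that these metrics originate in Amazon CloudWatch. To spot-check a single metric outside Dynatrace, a request along the following lines could be used. This is a minimal sketch: `AWS/States` is the CloudWatch namespace for Step Functions, while the ARN, the five-minute period, the one-hour window, and the helper name `build_metric_request` are placeholders chosen for illustration. Region appears as a dimension in Dynatrace; in CloudWatch it is determined by the client's region rather than passed as a dimension.

```python
from datetime import datetime, timedelta, timezone


def build_metric_request(metric_name, state_machine_arn, minutes=60):
    """Build keyword arguments for CloudWatch GetMetricStatistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/States",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 300,  # seconds; assumption for this sketch
        "Statistics": ["Sum"],  # matches the Statistics column for count metrics
    }


# Placeholder ARN, not a real resource.
request = build_metric_request(
    "ExecutionsFailed",
    "arn:aws:states:us-east-1:123456789012:stateMachine:example",
)
print(request["Namespace"], request["MetricName"])

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   response = boto3.client("cloudwatch").get_metric_statistics(**request)
```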

Limitations

Dynatrace gathers metrics for AWS Step Functions at the custom device group level instead of the custom device level (metrics are service-wide).