Adaptive traffic management for Dynatrace SaaS
Dynatrace Full-Stack Monitoring includes distributed tracing for applications via the patented PurePath® technology. Each monitored application or microservice continuously produces distributed traces, containing code-level and business insights, that are sent to Dynatrace.
Depending on the number of application transactions and on the host units consumed by your environment, OneAgent captures a certain number of end-to-end traces per minute. When transaction volume is high, the number of traces that OneAgent could capture might exceed the trace volume available to your environment based on its currently active host units.
Once this quota is reached, OneAgent starts sampling through the mechanism of adaptive traffic management. The resulting rate is called the OneAgent capture rate.
How is adaptive traffic management different from other sampling mechanisms?
In typical applications, the distribution of requests is not even. It's rather a combination of: a large number of unique URLs, a medium number of important requests, and, finally, a few kinds of requests that make up the majority of the traffic (for example, image requests or status checks).
With adaptive traffic management, OneAgent first calculates a list of top requests starting each minute, from which it then captures:
- Most traces of unique and rare requests.
- A significant but lower volume of highly frequent requests.
In this way, OneAgent reduces the data sent to your environment, ensuring that the number of captured traces stays within the host-unit limits of your Dynatrace agreement. Because the sampling is not random, all important data is captured while the sample set remains statistically valid.
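The idea of capturing most rare requests while capping highly frequent ones can be illustrated with a small sketch. This is not OneAgent's actual algorithm, which is not specified here. It is a simple "water-filling" allocation used as a stand-in, and the function name, request names, and budget are all hypothetical.

```python
def allocate_captures(counts: dict[str, int], budget: int) -> dict[str, int]:
    """Split a per-minute capture budget across request types.

    Illustrative stand-in only (not the OneAgent implementation):
    request types are visited from rarest to most frequent, and each
    one receives at most an equal share of the budget that is still
    unassigned. Rare types use less than their share, so the leftover
    flows to the frequent types, which end up capped.
    """
    remaining = max(budget, 0)
    types_left = len(counts)
    allocation: dict[str, int] = {}
    for name, count in sorted(counts.items(), key=lambda kv: kv[1]):
        fair_share = remaining // types_left
        captured = min(count, fair_share)
        allocation[name] = captured
        remaining -= captured
        types_left -= 1
    return allocation


# Hypothetical one-minute request counts, dominated by a few request types.
requests_last_minute = {
    "/report/123": 1,
    "/checkout": 40,
    "/search": 300,
    "/health": 5000,
    "/img": 10000,
}

captures = allocate_captures(requests_last_minute, budget=1000)
for name, total in requests_last_minute.items():
    not_captured = total - captures[name]
    print(f"{name}: captured {captures[name]} of {total}, {not_captured} not captured")
```

With these assumed numbers, the three less frequent request types are captured in full (1, 40, and 300 traces), while /health and /img are capped at 329 and 330 traces, so the total stays within the budget of 1,000.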
You can see the effect of adaptive traffic management in the distributed trace list. If OneAgent is sampling and not all requests are captured, the captured traces indicate that similar requests were not captured with the message [amount] more like this.
Because adaptive traffic management reduces the volume of processed data, it also saves network bandwidth.
Auto-adaptive quota
In Dynatrace SaaS, traffic management depends on the environment quota of allowed full-service call volume per minute. A single distributed trace can contain multiple full-service calls. The maximum number of full-service calls per minute (and therefore traces per minute) that your environment can receive scales with your license, because it's based on the number of host units that are active in your environment.
Allowed full-service call volume per minute = 250 full-service calls × active host units
This quota is maintained on the environment level and is shared across all monitored applications. In a sense, low-volume applications share their unused transaction volume with high-volume applications that need it.
Example: A moderately sized environment of 50 hosts with 32 GB of RAM each (= 100 host units) can process up to 250 × 100 = 25,000 full-service calls per minute.
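The quota formula and the example above can be reproduced with a few lines of arithmetic. The following is a minimal sketch: the function names are hypothetical, the value of 2 host units per 32 GB host is taken from the example above, and the observed call volume of 40,000 is an assumed figure used only to show how an upper bound on the capture fraction could be estimated.

```python
FULL_SERVICE_CALLS_PER_HOST_UNIT = 250  # per minute, from the quota formula


def allowed_full_service_calls_per_minute(active_host_units: float) -> float:
    """Environment-wide quota: 250 full-service calls x active host units."""
    return FULL_SERVICE_CALLS_PER_HOST_UNIT * active_host_units


def max_capture_fraction(observed_calls_per_minute: float, quota: float) -> float:
    """Upper bound on the share of calls that fits in the quota.

    Assumption for illustration: the achievable capture fraction is
    simply quota / observed volume, capped at 1.0.
    """
    if observed_calls_per_minute <= 0:
        return 1.0
    return min(1.0, quota / observed_calls_per_minute)


hosts = 50
host_units_per_host = 2  # 32 GB host, as in the example above
quota = allowed_full_service_calls_per_minute(hosts * host_units_per_host)
print(quota)  # 25000

# Assumed load of 40,000 full-service calls per minute across all applications.
print(max_capture_fraction(40_000, quota))  # 0.625
```

Because the quota is shared at the environment level, the calculation uses the total call volume across all monitored applications rather than a per-application figure.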
Monitor adaptive traffic usage and thresholds
You can use the preset dashboard OneAgent Traces - Adaptive traffic management to track usage and thresholds of adaptive traffic management. Metrics and charts provide insights into:
- Full-service calls per host unit
- Captured full-service calls
- OneAgent capture rate