PurePath® method-level transaction analysis
The first step in your analysis should be to locate the requests that you want to analyze at a granular level. Filters can narrow many thousands of requests down to just the few that are relevant to your analysis. You can do this during problem analysis by following the root cause analysis drill-downs (to begin problem analysis, select Problems in the Dynatrace menu), or manually by segmenting your requests using the advanced filtering mechanisms in the service flow and outlier analysis. Ultimately, you’ll likely discover a handful of requests that require deeper analysis. This is where distributed traces analysis comes in.
Detection of a single request
In the example below, the service easyTravel Customer Frontend received 175,000 service requests during the selected 2-hour timeframe. This is an unwieldy number of requests to work with, so we need to narrow down these results.
Suppose that we’re interested specifically in those requests that call AuthenticationService, and then easyTravel-Business MongoDB. Because AuthenticationService doesn’t contribute significantly to the overall response time, it’s enclosed within the aggregated entry called 6 services. By clicking this aggregate and selecting AuthenticationService from the list, we see that there are only 752 such requests.
Note that the easyTravel-Business services have 2,580 and 3,310 requests respectively. However, these are overall calls from many services, not just easyTravel Customer Frontend.
To focus on this subset of requests, select the last service in the chain of service calls that you want to analyze (easyTravel-Business in our case), and then click the Filter service flow button.
Now we see the selected subset of services. This fraction indicates that we’re looking only at transactions where easyTravel Customer Frontend calls AuthenticationService, which in turn calls easyTravel-Business MongoDB.
As you can see, 75% of the easyTravel Customer Frontend requests that call AuthenticationService also call VerificationService. These are the requests we want to focus our analysis on. Therefore, let’s add VerificationService as a second filter parameter to further narrow down the analysis. To do this, select the VerificationService node and then select View distributed traces in the easyTravel Customer Frontend box in the rightmost pane to access the list of distributed traces that are initiated by easyTravel Customer Frontend.
The View distributed traces button in the VerificationService box shows you the distributed traces that were initiated by VerificationService, as this is the currently selected service in the service flow.
For the given timeframe, you can see the last 100 single requests that were initiated by the service under analysis (the easyTravel Customer Frontend service in this example) and that also match the filter criteria. You can sort these single requests by Response time, CPU time, Method, or Response code. To view the distributed trace analysis of a single request, select the Analyze menu > Trace.
In the image above, notice the filter visualization in the upper-left corner. The provided list of distributed traces includes only those requests to easyTravel Customer Frontend that call both AuthenticationService and VerificationService. But the list is still too large; we only need to analyze the slower requests.
Let’s modify the filter on the easyTravel Customer Frontend node so that only requests with a Response time slower than 500 ms are displayed. To do so:
- Click the easyTravel Customer Frontend node.
- From the Create filter for list, select Response time.
- In the from input field, type the threshold value (500 ms in this example).
- Click Add.
- Click Apply.
As you can see in the following image, after applying the response time filter, we’ve identified 5 requests out of the initial 175,000 that justify in-depth distributed trace analysis. To begin analyzing the distributed traces of a request, select a request and then select Distributed traces.
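The combined effect of the two filters (called services plus response time) can be pictured as a simple predicate over requests. The following is a minimal sketch in plain Python, not a Dynatrace API: the `Request` record, its field names, and the sample data are all hypothetical and exist only to illustrate the filtering logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one frontend request (not a Dynatrace type)."""
    request_id: str
    response_time_ms: float
    called_services: frozenset


def matches_filters(request, required_services, min_response_time_ms):
    """Return True if the request calls every required service
    and is slower than the given response time threshold."""
    calls_all = required_services <= request.called_services
    is_slow = request.response_time_ms > min_response_time_ms
    return calls_all and is_slow


# Made-up sample data for illustration only.
requests = [
    Request("r1", 120.0, frozenset({"AuthenticationService"})),
    Request("r2", 640.0, frozenset({"AuthenticationService", "VerificationService"})),
    Request("r3", 910.0, frozenset({"VerificationService"})),
    Request("r4", 505.0, frozenset({"AuthenticationService", "VerificationService"})),
]

required = frozenset({"AuthenticationService", "VerificationService"})
slow = [r.request_id for r in requests if matches_filters(r, required, 500.0)]
print(slow)
```

Only requests that satisfy both conditions remain, which is why each added filter can only shrink the list.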
Distributed traces analysis of a single web request
Distributed traces analysis provides a waterfall visualization of all requests. Each service in the call chain is represented in the analysis.
In the Execution breakdown section of the example distributed traces analysis, you can see that the whole transaction consumes about 32.8 ms of CPU time, spends 559 ms in suspension, and 74.3 ms someplace else.
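As a rough cross-check of these figures, the three components can be added up and compared. The sketch below is plain Python rather than anything Dynatrace provides; it assumes the three categories do not overlap, and the dictionary keys are illustrative labels only.

```python
# Values taken from the example above, in milliseconds.
# Assumption: the three categories are disjoint parts of the same request.
breakdown_ms = {"CPU": 32.8, "Suspension": 559.0, "Other": 74.3}

total_ms = sum(breakdown_ms.values())

# Share of each category relative to the summed total.
shares = {name: value / total_ms for name, value in breakdown_ms.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Total: {total_ms:.1f} ms")
```

Under that assumption, suspension accounts for by far the largest share of the roughly 666 ms, which is a hint to look beyond CPU usage when investigating this request.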
However, the waterfall chart shows much more detail. The waterfall indicates which other services are called and in which order. We can see each call to the Verification services. We also see the subsequent calls to the MongoDB service that were made by Authentication service requests. Distributed traces analysis, like the service flow, provides end-to-end web-request visualizations, in this case that of a single request.
The colors and positions of the horizontal bars in the chart indicate both the sequence and response time of each of the requests. You can easily see which calls were executed synchronously and which were executed in parallel.
According to the example above, most of the time of this particular request was spent on the client side of the getLoyaltyStatus web service call. As indicated by the colors of the bars in the chart, the time was not spent on the server side of this web service but rather on the client side. If we were to investigate this call further, we would find underlying network latency.
By selecting one of the service or execution bars, you can get even more detail. You can analyze the details of each request in the distributed traces analysis. In the example below, you can see the web-request details of the main request. Such details include metadata, request headers, request parameters, and more. You can even see information about the proxy that this request was sent through.
Notice that some values are obscured with five asterisks (*****). These are confidential values, and the current user account doesn’t have permission to view them. They would be visible if the active user account had that permission.
Note the difference between hidden personal data and data obscured for aggregation purposes. Five asterisks (*****) mean obscured confidential data, which can be unmasked by any user with sufficient permissions. Three or fewer asterisks (***) are used for data aggregation purposes, and can’t be unmasked.
Besides the information provided in the Summary tab, you can also view more timing-specific details. Just select the Timing tab for the service call you’re interested in. The example below shows timing details for the authenticate web service call. In this case, we see that the request lasts 583 ms on the calling side, but only 1.11 ms on the server side. Here again, there is significant network latency.
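The gap between the two timings is simple arithmetic. The sketch below is plain Python with a hypothetical helper name; it only computes how much of the calling-side time is not accounted for by server-side processing. Attributing that remainder to network latency is the interpretation given above, not something the calculation itself proves.

```python
def time_outside_server_ms(client_side_ms, server_side_ms):
    """Return the part of the calling-side duration that is not covered
    by server-side processing (hypothetical helper, for illustration)."""
    if server_side_ms > client_side_ms:
        raise ValueError("server-side time cannot exceed calling-side time")
    return client_side_ms - server_side_ms


# Values from the authenticate call in the example above.
client_ms = 583.0
server_ms = 1.11

gap_ms = time_outside_server_ms(client_ms, server_ms)
print(f"{gap_ms:.2f} ms spent outside server-side processing")
print(f"{gap_ms / client_ms:.1%} of the calling-side time")
```

In this example almost the entire calling-side duration falls outside the server, so optimizing the server-side code of the authenticate call would barely change what the caller experiences.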
Code execution details of individual requests
Each request executes some code, for example, Java, .NET, PHP, Node.js, Apache webserver, NGINX, IIS, or something else. The distributed traces view enables you to look at the code execution of each and every request. Simply select a particular service and select the Code level tab.
This view shows you code-level method executions and their timings. Dynatrace tells you the exact sequence of events with all respective timings (for example, CPU, wait, sync, lock, and execution time). As you can see above, Dynatrace tells you exactly which method in the orange.jsf request on the easyTravel Customer Frontend called the respective web services and which specific web-service methods were called. The timings displayed here are the timings as experienced by the easyTravel Customer Frontend, which, in the case of calls to services on remote tiers, represent the client time.
Notice that some execution trees are more detailed than others. Some contain full stack traces. Dynatrace automatically adapts the level of information it captures based on importance, timing, and estimated overhead. Because of this, slower parts of a request typically contain more information than faster parts.
You can look at any request in the distributed traces view and navigate between the respective code-level trees. This gives you access to the full execution tree.
Different teams, different perspectives
Each distributed trace tracks a request from start to finish. This means that the traces always start at the first fully monitored process group. However, just because a request starts at the easyTravel Customer Frontend service doesn’t mean that this is the service you’re interested in. For example, if you’re responsible for AuthenticationService, it makes more sense for you to analyze requests from the perspective of AuthenticationService.
Let’s look at the same flow once again, but this time we’ll look at the requests of AuthenticationService directly. To do so, go back to the service flow, select the AuthenticationService node, and select Distributed traces in the AuthenticationService box.
We can additionally add a response time filter on the AuthenticationService node to view only requests slower than 50 ms. With this adjustment, the list now shows only those requests of AuthenticationService that are slower than 50 ms, that are called by the Customer Frontend service, and where the front-end request also calls VerificationService.
Now we can analyze AuthenticationService without including the easyTravel Customer Frontend service in the analysis. This is useful if you’re responsible for a service that is called by services developed by other teams.
Of course, if required, we can use service backtrace at any time to see where this request originated from. To do so, select the Analyze tab of the trace details, and select Analyze backtrace.
We can then choose to once again look at the same distributed trace from the perspective of the easyTravel Customer Frontend service. To do so, go to the Analyze tab of the Analysis of easyTravel Customer Frontend request section and select Distributed traces in the Analyze selection 'easyTravel Customer Frontend' column.
This is the same distributed trace we began our analysis with. You can still see the Authenticate call and its two database calls, but now the call is embedded in a larger request.