Dynatrace Synthetic Monitoring Retry on Error is a new feature that helps users better understand how their application is performing. Simply put, if a synthetic test returns an error, the software retains that result but automatically and immediately re-runs the same test. The feature is easy to understand and easy to use, and the information gained can be very valuable.
Sometimes you get the same results on the automatic retry – the test fails again. Sometimes you get different results; the test passes, and returns useful data. Sometimes there is a problem with your application, or with the Internet. Or sometimes there was an inexplicable glitch in transmission or in processing that will disappear the next time that you try that test.
Those are the scenarios that Retry on Error attempts to clarify. If the retried test succeeds, you can discount the failure as a false positive. If it fails again, that indicates there may be an issue with either the application or the Internet. Our statistics from similar technology in our Dynatrace Ruxit service show that retry on error can detect on the order of 45 percent of false positives when the application is running appropriately.
If an error is confirmed, our Alerting engine can send out an alert when there are one or more test failures (the actual parameters are configurable). Retry on Error makes it possible to have more accurate alerts over time.
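As a rough illustration of a configurable alert threshold, the sketch below fires an alert only once a given number of consecutive tests have been confirmed as failures. It is not the Dynatrace Alerting engine or its API: the function name `should_alert`, the `threshold` parameter, and the choice to count consecutive failures are all assumptions made for the example.

```python
from typing import Sequence


def should_alert(confirmed_failures: Sequence[bool], threshold: int = 1) -> bool:
    """Decide whether to alert (hypothetical helper, not a Dynatrace API).

    confirmed_failures: one entry per test, oldest first; True means the
        test failed and the automatic retry failed as well.
    threshold: assumed setting for how many consecutive confirmed
        failures are needed before an alert is sent.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    recent = confirmed_failures[-threshold:]
    # Alert only if there is enough history and every recent test failed.
    return len(recent) == threshold and all(recent)


# One confirmed failure is enough with the default threshold of 1.
print(should_alert([False, True]))
# With a threshold of 2, a single confirmed failure does not alert yet.
print(should_alert([False, True], threshold=2))
```

Because a failure that passes on retry never enters the history as a confirmed failure, a threshold like this is applied only to errors that survived the retry.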
Dynatrace can also go a step beyond that to analyze a suspected outage to determine the proximate cause. Outage Analyzer will help teams pinpoint where the problem may be.
Retry on Error: Foundation for the Future
Today, Retry on Error is effective for a single URL at a time with a single-step transaction. In the case of a failure, it retries the same test under the same conditions. The test either fails once again, or it succeeds. If it fails, that indicates that there is a problem that requires further investigation.
The value of Retry on Error is that it helps correct for false positives; if a failed test is successful on retry, the failure was likely due to a momentary glitch. We record that as a successful test, so that it’s not something you have to investigate further.
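The retry behavior described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Dynatrace's implementation: `SyntheticResult`, `run_with_retry`, and the convention that a check returns `None` on success or an error message on failure are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SyntheticResult:
    passed: bool                # final status recorded for the test
    retried: bool               # True if the first attempt failed and a retry ran
    first_error: Optional[str]  # error from the first attempt, kept for reference


def run_with_retry(check: Callable[[], Optional[str]]) -> SyntheticResult:
    """Run a single-step check; on error, immediately re-run it once.

    `check` is a hypothetical callable that returns None on success
    or an error message on failure.
    """
    first = check()
    if first is None:
        return SyntheticResult(passed=True, retried=False, first_error=None)
    # Same test, same conditions, run again right away.
    second = check()
    # A pass on retry is recorded as a success; the first error is retained.
    return SyntheticResult(passed=second is None, retried=True, first_error=first)


# Simulate a momentary glitch: the first attempt times out, the retry passes.
attempts = iter(["timeout", None])
result = run_with_retry(lambda: next(attempts))
print(result.passed, result.retried, result.first_error)
```

In the simulated run, the test is recorded as passed while the original "timeout" error is still available for anyone who wants to look at it.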
In the months to come, we are planning to enhance Retry on Error further so that it is even more useful. Users will be able to define more complex tests with multiple steps and multiple URLs, so that eventually any test will be able to make use of this feature.
We are also looking at a range of predictive analytics. Using Dynatrace expertise and knowledge regarding data going through individual Internet backbones, our goal is to estimate how likely it is that a particular failure is a false positive, an Internet failure, or an application failure. This type of information should enable you to immediately focus on wherever the problem may be. For example, if it is your application, you can turn to the root cause analysis feature and Dynatrace Application Monitoring to help diagnose and resolve the issue.
As you start applying synthetic monitoring Retry on Error, let us know how you’re using it. We want to hear your stories. We would like to know if you think we are heading in the right direction with this feature.