How Dynatrace empowers performance engineering teams to test at scale

As organizations develop more applications and microservices, they are discovering they also need to run more performance tests in the same amount of time or less to meet service-level objectives (SLOs) that fulfill service-level agreements (SLAs). But because of the complexity involved in executing and analyzing test results of dynamic systems, performance engineering is difficult to scale — especially with lean staff or resources. How can organizations address this process bottleneck and run more tests in less time?

At Perform 2021, Andi Grabner, Strategic Partners Director at Dynatrace, and Roman Ferstl, Managing Director at Triscon, outlined the performance testing challenges many organizations are experiencing today. Ferstl then shared how Triscon and ERGO*, one of the major insurance groups in Germany and Europe, were able to implement automated performance testing. Grabner also introduced four ways organizations can turbocharge their performance engineering with automation.

Current challenges with performance testing

According to the Dynatrace Autonomous Cloud survey, organizations are running into performance testing challenges in three areas: speed, quality, and scale. They report a 9:1 ratio of script maintenance versus creation and about a 90% rate of test reruns. Organizations are having trouble scaling, too — only 10% of their projects are tested.

Figure: Challenges of scaling performance engineering affect speed, quality, and scale.

Why are they grappling with these challenges? As Grabner explained, 80% of their time is spent on manual tasks, such as creating scripts, monitoring configurations, analyzing test results, and generating reports. Not surprisingly, they’re looking for a better approach. According to The State of Performance Engineering 2020 from Neotys and Sogeti (part of Capgemini), 56% of organizations want to implement “zero-touch” performance as a self-service.

Although some organizations have invested in automated pipelines, many are still struggling with analyzing functional tests and performance tests. Large quantities of unstructured monitoring data can slow down the process even further. Teams need to be able to quickly analyze and understand the results of their performance tests. To do so, teams can integrate performance testing as a self-service into the development process.

How Triscon and ERGO scaled performance testing with quality gates

For several years, Triscon, a Vienna-based firm dedicated to performance testing, has been implementing increasingly sophisticated automated performance testing processes for ERGO, one of the leading insurance companies in Europe. “We now do 200-250 tests per year and serve more than 40 applications and services,” Ferstl said, adding that ERGO has increased its efficiency more than tenfold through automation.

Most recently, Triscon and ERGO implemented a proof of concept (PoC) for automated quality gates with Keptn. “The goal was to automate the entire process of performance testing, including the result analysis,” Ferstl explained. These quality gates allowed ERGO to identify performance issues at each stage of the development process.

Figure: SLO-based quality gates with Keptn. ERGO’s PoC for automated quality gates with Keptn was so successful that the company plans to implement it for up to 480 fully automated tests per year and to offer performance tests as a self-service for development teams.

Because this PoC was so successful, ERGO now plans to automate quality gates for 10-20 existing microservices in the first half of 2021, eventually doubling its test execution to 240-480 fully automated tests per year. But the most exciting outcome is that ERGO will also offer performance tests as a self-service for development teams to get quick feedback in the early development stages.

ERGO also plans to include quality gates as part of onboarding for new microservices and use them to automate test evaluation for end-to-end tests. Because ERGO was so pleased with the results of this PoC, they now plan to make it part of their ongoing process. “All the services [we] onboard … in the future will have to have this process of automated quality gates,” Ferstl said.

Four ways you can turbocharge performance testing

As Grabner explained, organizations can use Dynatrace to turbocharge their performance testing in four ways: by automating monitoring, automating performance, automating root cause analysis, and establishing an SLO-driven culture.

Figure: Four initiatives for performance engineering. Establishing an SLO-driven culture makes it easy to automate monitoring, performance, and root-cause analysis.

1. Automating monitoring

Many organizations already have monitoring in place, which is a good first step. But they can accomplish even more by letting developers decide what metrics, dashboards, and insights they want to see as they’re pushing their code through each stage of the pipeline.

For example, with Dynatrace, you can create customized staging dashboards aimed at performance engineers and SLO dashboards designed for the production team. At each stage, automated monitoring gives people the power to decide what they need, so they can access the insights that help them make better decisions.

2. Automating performance

The next step is automating performance throughout the delivery pipeline. “When you’re integrating your performance tests with your pipeline and you have Dynatrace monitoring the applications on the load, you will be able to see all data with context in [Dynatrace],” Grabner said. “That means Dynatrace will be the place where you can see your metrics, your transactional data, everything in the context of a particular test transaction.”

Once you have all full-context data in one place, you can understand where your hot spots are and figure out what’s going on with your tests. By empowering your teams to specify SLOs at the test transaction level, you can bring your SLO-based test analysis to the next level.
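To make transaction-level SLOs more concrete, here is a minimal Python sketch. It is not Dynatrace's implementation. It assumes load-test results are available as (transaction name, response time in ms) pairs and that each transaction has its own 95th-percentile limit. The helper names `percentile` and `evaluate_transactions`, the sample data, and the limits are all hypothetical.

```python
from collections import defaultdict
from math import ceil


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def evaluate_transactions(samples, p95_limits_ms):
    """Group response times by transaction and compare each p95 to its limit.

    samples: iterable of (transaction_name, response_time_ms) pairs.
    p95_limits_ms: hypothetical per-transaction SLO limits in milliseconds.
    Returns {transaction: (observed_p95, limit, passed)}.
    """
    by_transaction = defaultdict(list)
    for name, response_ms in samples:
        by_transaction[name].append(response_ms)

    report = {}
    for name, limit in p95_limits_ms.items():
        times = by_transaction.get(name)
        if not times:
            # A transaction with an SLO but no data is treated as a failure.
            report[name] = (None, limit, False)
            continue
        observed = percentile(times, 95)
        report[name] = (observed, limit, observed <= limit)
    return report


# Illustrative data only: two transactions from an imaginary test run.
samples = [("login", 120), ("login", 180), ("login", 950),
           ("search", 300), ("search", 320), ("search", 310)]
limits = {"login": 500, "search": 400}

for name, (observed, limit, passed) in evaluate_transactions(samples, limits).items():
    status = "pass" if passed else "fail"
    print(f"{name}: p95={observed} ms, limit={limit} ms, {status}")
```

Evaluating each transaction separately keeps one slow step, such as the login outlier above, from being averaged away by faster ones.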

3. Automating root cause analysis

You can also use automation to streamline the processes involved in root cause analysis. If you have integrated your testing tools with Dynatrace, your engineers can go directly into Dynatrace for valuable insight into what happened after a test has failed.

As Grabner explained, “You can say, ‘Dynatrace, show me where are the things that I can improve? Where are the things that are slow? What’s possible for me to improve my overall performance?’” Dynatrace also allows you to compare one build with another, right down to the database call or exception, so you can quickly pinpoint the root cause.
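The idea of comparing one build with another can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Dynatrace performs the comparison: it assumes each build is summarized as a dictionary of metrics where lower is better, and the function name `compare_builds`, the 10% tolerance, and the sample numbers are hypothetical.

```python
def compare_builds(baseline, candidate, tolerance_pct=10.0):
    """Flag metrics where the candidate build is worse than the baseline.

    baseline, candidate: {metric_name: value}, where lower is better
    (for example response time in ms or number of database calls).
    tolerance_pct: allowed increase before a metric counts as a regression.
    Returns a list of (metric, baseline_value, candidate_value, change_pct),
    sorted with the largest regression first.
    """
    regressions = []
    for metric, base_value in baseline.items():
        if metric not in candidate:
            continue  # metric was not captured for the candidate build
        new_value = candidate[metric]
        if base_value == 0:
            # Avoid division by zero: any increase from zero is a regression.
            change_pct = float("inf") if new_value > 0 else 0.0
        else:
            change_pct = (new_value - base_value) / base_value * 100
        if change_pct > tolerance_pct:
            regressions.append((metric, base_value, new_value, change_pct))
    return sorted(regressions, key=lambda item: item[3], reverse=True)


# Illustrative numbers only, not taken from the presentation.
build_41 = {"response_time_ms": 200, "db_calls": 4, "exceptions": 0}
build_42 = {"response_time_ms": 210, "db_calls": 12, "exceptions": 3}

for metric, old, new, change in compare_builds(build_41, build_42):
    print(f"{metric}: {old} -> {new} ({change:+.0f}%)")
```

In this made-up run the response time moves only slightly, while the jump in database calls and the new exceptions stand out as the places to look first.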

4. Establishing an SLO-driven culture

Automation can also help your engineers keep their SLOs top of mind from day one by letting them define their own SLIs and SLOs as YAML files. Dynatrace can then automatically push these configurations through the pipeline, using them to keep SLOs front and center at each stage of the development process.
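As a rough illustration of what such a definition could contain, the Python sketch below keeps objectives in a plain dictionary shaped like the data a YAML file might deserialize to. The field names (`sli`, `pass`, `warning`) and the criteria syntax (such as `"<600"`) are assumptions made for this example rather than a verified Dynatrace or Keptn schema, and the helpers `meets` and `grade` are hypothetical.

```python
import operator
import re

# Comparison operators allowed in a criterion string such as "<=800".
_OPERATORS = {
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
}
_CRITERION = re.compile(r"^\s*(<=|>=|<|>|=)\s*(\d+(?:\.\d+)?)\s*$")


def meets(value, criterion):
    """Return True if value satisfies a criterion string like '<600'."""
    match = _CRITERION.match(criterion)
    if match is None:
        raise ValueError(f"unsupported criterion: {criterion!r}")
    symbol, threshold = match.groups()
    return _OPERATORS[symbol](value, float(threshold))


def grade(objective, value):
    """Grade one measured SLI value as 'pass', 'warning', or 'fail'."""
    if all(meets(value, c) for c in objective["pass"]):
        return "pass"
    warning = objective.get("warning")
    if warning and all(meets(value, c) for c in warning):
        return "warning"
    return "fail"


# Hypothetical objectives, shaped like the content of an SLO file.
slo_definition = {
    "objectives": [
        {"sli": "response_time_p95", "pass": ["<600"], "warning": ["<=800"]},
        {"sli": "error_rate", "pass": ["<=1"]},
    ]
}
# Made-up measurements from a test run.
measured = {"response_time_p95": 720.0, "error_rate": 0.4}

for objective in slo_definition["objectives"]:
    name = objective["sli"]
    print(name, grade(objective, measured[name]))
```

Keeping the objectives as declarative data rather than code is what lets the same definition travel with a service through every pipeline stage.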

For example, SLOs can act as quality gates during the development stage. During staging, teams can use SLOs to decide whether to move a new build into production. At the production stage, SLOs and AI-based risk mitigation tools can help organizations determine whether they are at risk of violating SLOs or SLAs and exposing the organization to legal risk.
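One way to picture SLOs acting as a gate between stages is a weighted score compared against a stage-specific threshold. The Python sketch below illustrates that idea only. The scoring scheme (full points for a pass, half for a warning), the weights, and the per-stage thresholds are hypothetical and are not described in the presentation.

```python
# Points awarded per objective result; a hypothetical scheme for illustration.
POINTS = {"pass": 1.0, "warning": 0.5, "fail": 0.0}

# Hypothetical minimum scores (in percent) required to leave each stage.
STAGE_THRESHOLDS = {"development": 75.0, "staging": 90.0}


def gate_score(results):
    """Weighted score in percent for a list of (result, weight) pairs."""
    total_weight = sum(weight for _, weight in results)
    if total_weight == 0:
        raise ValueError("at least one objective with non-zero weight is required")
    earned = sum(POINTS[result] * weight for result, weight in results)
    return earned / total_weight * 100


def may_promote(stage, results):
    """Return True if the build's score meets the threshold for this stage."""
    return gate_score(results) >= STAGE_THRESHOLDS[stage]


# Illustrative evaluation: three objectives, the first weighted double.
results = [("pass", 2), ("warning", 1), ("pass", 1)]
score = gate_score(results)  # (2 + 0.5 + 1) / 4 * 100 = 87.5
print(f"score={score:.1f}%")
print("promote from development:", may_promote("development", results))
print("promote from staging:", may_promote("staging", results))
```

With these made-up numbers the build clears the looser development gate but is held back at staging, which is the kind of stricter check a team might want before production.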

Turbocharge your performance testing with intelligent automation

For those exploring automated SRE-driven performance engineering for the first time, Ferstl explained that it’s important to evaluate where you stand and identify specific processes you can improve with automation. Once you have automated those processes, you can start implementing automated quality gates, beginning with a pilot project. Finally, you can move toward integrating quality gates throughout your CI/CD pipeline.

If your organization is running into challenges with performance testing at scale, you’re not alone. Many organizations are looking for efficient ways to run more tests in less time. As Grabner and Ferstl demonstrated, intelligent automation makes it possible to turbocharge your performance testing and transform even faster.

To learn exactly how performance engineering as a self-service works, watch the full Perform 2021 presentation using one of the local links below. And if you’re curious about how Dynatrace and Neotys integrate to eliminate manual processes around test-run setup and analysis, check out the on-demand Performance Clinic, Tutorial on SRE-Driven Performance Engineering with Neotys and Dynatrace.





*About ERGO:

ERGO is one of the major insurance groups in Germany and Europe. The Group is represented in around 30 countries worldwide, focusing mainly on Europe and Asia. ERGO offers a comprehensive range of insurance, pensions, investments, and services. In its home market of Germany, ERGO ranks among the leading providers across all segments.

ITERGO Informationstechnologie GmbH is part of the global Tech Hub structure of ERGO Technology & Services Management AG (ET&SM).
