We have all experienced the excitement of bringing a new technology or architectural pattern, like microservices, to the design table: “It is really powerful and exactly what we need to meet our challenges, so let’s start implementing it!”

I feel the same way about microservices! The pattern has a lot of potential and has already changed the way we think about development, automated testing and operations. However, introducing a new technology or architectural pattern does not mean that we should forget the best practices we have learned in the past. Definitely not when it comes to performance and scalability!

In this blog I will take a closer look at the “grains of sand” microservice pattern, which I recently observed in a project where we tried to migrate from a monolith to a hybrid architecture of a monolith plus microservices. Let me show you how that pattern emerged, explain its impact, and demonstrate how easily it can be detected with the help of your Application Performance Monitoring (APM) solution.

From Monolith to Microservice: Project Introduction

The goal of our project was to extract one feature out of the monolithic application and move it to two brand-new microservices. In the new architecture the monolith calls the load-balanced, first-tier microservice, which in turn calls the load-balanced, second-tier microservice. The monolith changes are made by Team A, while the microservices are designed and built by Team B.

High-level architectural overview showing the interaction of the monolithic application with the first- and second-tier microservices

The First Load Test (JMeter)

After the teams went off to implement the changes, we received a version that we could load test. The first load test is always an exciting moment for me, as we finally get to learn whether the ideas from the design table actually work.

The load test analysis started with a big disappointment! The load test metrics showed a high impact on the web server and database response times.

To understand what happened, we analyzed the Dynatrace AppMon PurePaths we captured while running the load test. The following screenshot of the Dynatrace Transaction Flow immediately showed us where our architectural flaw was:

  • One monolithic web request results in 20 microservice Tier 1 calls
  • Those Tier 1 calls then make approximately 400 microservice Tier 2 calls
  • This huge number of calls results in a total of 4,054 database statements
The Dynatrace Transaction Flow and the detailed PurePath data for every single request underneath made it easy for us to understand where our architecture had flaws!

So what happened?

The microservice design turned out to be too fine-grained. To get the job done, the monolith needed to call microservice Tier 1 twenty times! The same pattern was found in microservice Tier 1, which had to call microservice Tier 2 twenty times to complete its part of the task. Bringing this all together resulted in “a loop within a loop” behavior, with the consequences on the web and database servers identified above.

We simply had a microservice architecture that was too fine-grained, which is known as the “grains of sand” pattern. Maybe we should start calling our microservices nano-services from now on?
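To make the “loop within a loop” behavior more tangible, here is a minimal, hypothetical sketch of the call pattern. The class names, the 20 x 20 fan-out and the roughly 10 statements per Tier 2 call are illustrative assumptions that only approximate what we saw in the PurePaths; they are not the real project code.

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;

    // Hypothetical sketch of the "loop within a loop" call pattern.
    public class GrainsOfSandSimulation {

        static final AtomicInteger tier1Calls = new AtomicInteger();
        static final AtomicInteger tier2Calls = new AtomicInteger();
        static final AtomicInteger dbStatements = new AtomicInteger();

        // Monolith side: the extracted feature still thinks in single items,
        // so it calls the fine-grained Tier 1 service once per item.
        static void handleMonolithRequest(List<String> items) {
            for (String item : items) {                    // 20 items -> 20 Tier 1 calls
                callTier1(item);
            }
        }

        // Tier 1 service: resolves each item into ~20 details and calls Tier 2 per detail.
        static void callTier1(String item) {
            tier1Calls.incrementAndGet();
            for (int detail = 0; detail < 20; detail++) {  // 20 details -> 20 Tier 2 calls
                callTier2(item, detail);
            }
        }

        // Tier 2 service: each call issues a handful of SQL statements (assumption: ~10).
        static void callTier2(String item, int detail) {
            tier2Calls.incrementAndGet();
            dbStatements.addAndGet(10);
        }

        public static void main(String[] args) {
            List<String> items = IntStream.range(0, 20)
                    .mapToObj(i -> "item-" + i)
                    .collect(Collectors.toList());

            handleMonolithRequest(items);

            // One monolith request fans out to 20 x 20 = 400 Tier 2 calls and roughly
            // 4,000 database statements -- in the same order of magnitude as the PurePaths showed.
            System.out.println("Tier 1 calls:  " + tier1Calls);
            System.out.println("Tier 2 calls:  " + tier2Calls);
            System.out.println("DB statements: " + dbStatements);
        }
    }

A coarser-grained interface, for example a Tier 1 endpoint that accepts all twenty items in a single batch call, would collapse this fan-out back to a handful of service calls per web request.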

What’s the opinion of the infrastructure team?

When we discussed the load test results with the infrastructure team they raised this concern: “This new architecture threatens the health of our load balancer and firewall!” They were right! Nano-services 1 and 2 cause a massive increase in the number of requests (roughly 400 times more than our old, fully monolithic solution). This puts the health of the load balancer and firewall at risk.

Understanding the impact of architectural decisions on the underlying infrastructure is one of the key aspects and responsibilities of your DevOps strategy. It is about raising developers’ awareness of operations and of the impact their decisions will have:

It’s important to understand the consequences of architectural decisions for the underlying supporting infrastructure. Make sure to do this analysis early on, and make sure to include your colleagues with operations experience.

To summarize the impact: more load on the load balancer and firewall, more load on the web server, and more database query load. And don’t forget: more bytes being sent through the network. If you move this architecture to IaaS (Infrastructure as a Service) you need to factor in the additional cost for every byte your application transmits!
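As a back-of-the-envelope illustration of how quickly this adds up, here is a small calculation. The request volume and payload size are made-up assumptions for illustration only, not measured values from our load test.

    public class TrafficEstimate {
        public static void main(String[] args) {
            // All numbers below are assumptions for illustration only.
            long requestsPerDay   = 1_000_000L;  // monolith web requests per day
            long extraCallsPerReq = 20 + 400;    // additional Tier 1 + Tier 2 calls per request
            long bytesPerCall     = 2_048L;      // average request + response payload per service call

            long extraBytesPerDay = requestsPerDay * extraCallsPerReq * bytesPerCall;
            double extraGiBPerDay = extraBytesPerDay / (1024.0 * 1024.0 * 1024.0);

            System.out.printf("Extra service calls per day: %,d%n", requestsPerDay * extraCallsPerReq);
            System.out.printf("Extra network traffic per day: %.0f GiB%n", extraGiBPerDay);
        }
    }

Even with these modest assumptions that is roughly 800 GiB of additional network traffic per day, traffic that simply did not exist when the calls stayed inside the monolith.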

Make sure to understand new message and traffic flow patterns when extracting monolithic code into microservice-like architectures.

Continuous Delivery: Was a load test the only way to identify this problem?

The “grains of sand” pattern is a nice example of a problem that a simple functional test could have detected much earlier. Using an APM tool such as Dynatrace during the early development and prototype phases lets you validate what is really going on while executing the first manual or automated tests. Dynatrace AppMon also comes with a test automation feature that automatically detects these types of architectural regressions between your CI builds. Assuming you have a set of functional tests that ensure you don’t break any functionality, Dynatrace automatically tells you whether code and architectural changes introduced a pattern such as “grains of sand”. In the Dynatrace PurePath we would simply see this pattern as a dramatic increase in the number of service calls, the number of database calls or the bytes sent/received over the network. Because Dynatrace captures these details for every single transaction, we can identify this pattern even under NO load. Why? Because the pattern itself is in your code, and can be observed even with a single functional test.
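If you prefer to make such a check explicit in your pipeline, you can also assert on these architectural metrics directly in a functional test. The sketch below assumes a hypothetical helper that reads the per-test-case database count from your monitoring tool after the test has run; it is not a real Dynatrace API, just an illustration of the idea.

    import static org.junit.Assert.assertTrue;

    import org.junit.Test;

    // Minimal sketch of an architectural regression check inside a functional test.
    public class CallMicroServiceArchitectureTest {

        // Baseline taken from a known-good build of the monolithic implementation
        // (the concrete threshold is an assumption for this sketch).
        private static final int MAX_ALLOWED_DB_STATEMENTS = 20;

        @Test
        public void databaseFanOutStaysWithinBaseline() {
            int dbStatements = fetchDbCountForTestCase("CallMicroService");

            assertTrue("Architectural regression: " + dbStatements
                            + " database statements for a single functional test (baseline allows "
                            + MAX_ALLOWED_DB_STATEMENTS + ")",
                    dbStatements <= MAX_ALLOWED_DB_STATEMENTS);
        }

        // Hypothetical helper: query your monitoring tool (e.g. via its REST interface)
        // for the DB count recorded for this test case. Left as a stub in this sketch.
        private int fetchDbCountForTestCase(String testCaseName) {
            return 0;
        }
    }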

How to get Dynatrace AppMon into your CI/CD?

For more details check out the documentation on Dynatrace Test Automation or the Performance Clinic on Shift-Left Performance on YouTube. Essentially, the only thing you need to do is make sure the app you are testing is instrumented with Dynatrace. A small change in your test (Selenium, JMeter, SoapUI) will additionally make sure that Dynatrace knows what type of test you are executing, because Dynatrace evaluates these patterns per executed test scenario, allowing you to identify regressions at the test case level.
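That “small change” boils down to tagging each outgoing test request so the captured PurePath can be attributed to a specific test case. The sketch below shows the idea in plain Java; the target URL is a placeholder, and the header name and key/value syntax should be verified against the test automation documentation for your AppMon version.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Sketch: tag a functional-test request so the APM tool can attribute the
    // captured transaction to a test suite and test case.
    public class TaggedTestRequest {

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://monolith.example.com/feature"))  // placeholder URL
                    // Header name and key/value format as used by AppMon test automation --
                    // double-check both against the documentation for your version.
                    .header("X-dynaTrace-Test", "TSN=MicroServiceMigration;TN=CallMicroService")
                    .GET()
                    .build();

            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Status: " + response.statusCode());
        }
    }

In JMeter or SoapUI you achieve the same effect by adding the header to the request definition instead of writing code.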

It’s really simple: I created a SoapUI functional test that exercises the end-user use case. First I executed it against the pure monolithic setup. Then I ran the same test against the microservice setup. In the screenshot below you can see that Dynatrace AppMon automatically detects my test case “CallMicroService”. For every test it automatically captures all key architectural metrics such as # of SQL Executions (DB Count), Bytes Sent / Received, # of Log Messages Created, and Response Time. The DB Count measure shows a huge increase between the test against the monolith and the test against the microservice setup. This is very powerful, as Dynatrace automatically levels up your existing functional tests and informs you of any architectural or performance regression. You can use this to stop a bad code change minutes after it was committed to your source code repository.

Dynatrace automatically highlights regressions based on architectural and performance metrics such as # of SQL, # of Bytes Sent/Received. Integrate this into your CI/CD and stop bad code changes minutes after commit!

Continuous Delivery: Can we take it a step further?

Some time ago Dynatrace launched the “Share your PurePath” program. Many AppMon customers joined this program by sending in their high-impact PurePaths. Based on the insights obtained through the program, the most common performance problem patterns were identified. It will not surprise you that one of these problem patterns is high database impact (e.g. a large number of queries). Starting with Dynatrace AppMon 6.5 these problem patterns are automatically detected. Below you can find an example of the Automatic Problem Pattern Detection spider web.

Dynatrace automates the detection of most common problem patterns in modern applications. This makes everyone a performance and architectural expert!

You could ask AppMon to show the pattern spider web for all the PurePaths related to a certain CD test. That way you get an extra quality indicator for your build. You can even automate this step and pull the detected patterns through the REST interface, which was made available with the Dynatrace AppMon 6.5.1 update.
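A build step that consumes this REST interface could then veto a build automatically. The sketch below only illustrates the shape of such a gate: the URL and the pattern name it checks for are placeholders, not the actual AppMon 6.5.1 REST resource, so look up the exact endpoint and response format in the AppMon REST documentation.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Sketch of a CI gate that pulls quality data from a REST interface and
    // fails the build step if a critical problem pattern was detected.
    public class ProblemPatternGate {

        public static void main(String[] args) throws Exception {
            // Placeholder URL -- replace with the real AppMon REST resource for your version.
            URL url = new URL("http://appmon.example.com:8020/rest/placeholder/problem-patterns");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");

            StringBuilder body = new StringBuilder();
            try (BufferedReader reader =
                         new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line);
                }
            }

            // Naive check for illustration: the pattern name is a placeholder string,
            // not the identifier AppMon actually reports.
            if (body.toString().contains("Excessive database access")) {
                System.err.println("Critical problem pattern detected - failing this build step.");
                System.exit(1);
            }
            System.out.println("No critical problem patterns detected.");
        }
    }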

You can either pull this data into the build server of your choice, such as Jenkins, Bamboo, TeamCity or Team Foundation Server, or you can integrate it into the very popular Hygieia DevOps dashboard open sourced by our friends at Capital One. Andi Grabner recently showed this in his Shift-Left Performance Clinic as well as during the PurePerformance podcast interview he did with Adam Auerbach from Capital One. This is a perfect example of a dashboard that can be extended with the automatic problem pattern detection information.

Important conclusions

  • Shift performance left! Try to find a large percentage of performance issues as early as possible, for example by running your functional tests with APM instrumentation, and only allow “good builds” to make it to the later pipeline stages. Select important architectural, performance and scalability metrics and use them to detect regressions that can stop a build in your test pipeline. Dynatrace AppMon can support you in capturing, baselining and analyzing these metrics. You can even introduce the free local Dynatrace AppMon Personal License into your dev/test team, which allows developers to validate the performance of their code before they check it in.
  • Introducing new technologies also means reintroducing the performance best practices we already know.

Too many queries, too many service calls: it seems trivial to prevent them. However, the “grains of sand” example is a commonly seen pattern during the introduction of microservices. You can use the power of an APM solution to support you here. The Dynatrace AppMon test automation dashboard and the automatic problem pattern detection are examples of how to do this.