No matter how often you deploy your application or how sophisticated your delivery pipeline is, you always need to know the quality status of the software you are building. That can only be done if you measure it; but measure what exactly? In Part I we introduced the Concept of Quality Metrics in CD (Continuous Delivery) by looking at the metric # of Requests per End User Action. In Part II we will focus on metrics around database access.

You need to be aware of bad database access patterns right when they are introduced in your code, whether the cause is an incorrectly configured O/R (Object Relational) Mapper such as Hibernate, TopLink or JDO, or simply bad coding. Finding these problems immediately by looking at the right metrics makes it easier for developers to fix them, reduces test cycles and gives Operations more confidence that a new deployment will not blow up their current database server.

Examples of Bad Database Access Patterns

The following example is taken from a web application that displays an account report with 25 items per page, containing the names and high-level statuses of these accounts. The Developers decided to go with Hibernate to access the account objects stored in the database. A closer look at the SQL queries generated for this report reveals that Hibernate not only loads the account objects that match the search query, but actually loads all account objects including all referenced objects in a related history table. These history objects are not needed at all to generate the report. The way Hibernate is used by the application (eager loading of all objects and their referenced objects) results in more than 4000 SQL executions contributing 6s to the total page load time:

The way Hibernate is used by the application results in 4k+ individual SQL Statements, returning much more data than is actually needed for the report
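
The root cause of such a pattern is usually an association mapped for eager fetching, so every account that is loaded drags its complete history along. The following minimal sketch illustrates the idea with hypothetical JPA/Hibernate entities (the class and field names are assumptions, not taken from the application above):

```java
import javax.persistence.*;
import java.util.List;

// Hypothetical entities illustrating the eager-loading pattern described above.
@Entity
public class Account {

    @Id
    @GeneratedValue
    private Long id;

    private String name;
    private String status;

    // FetchType.EAGER forces Hibernate to load the complete history for every
    // account, even though the report only needs name and status. With many
    // accounts and history entries this quickly adds up to thousands of
    // additional SELECTs (the classic N+1 problem).
    @OneToMany(mappedBy = "account", fetch = FetchType.EAGER)
    private List<HistoryEntry> history;
}

@Entity
class HistoryEntry {

    @Id
    @GeneratedValue
    private Long id;

    @ManyToOne
    private Account account;

    private String details;
}
```

Mapping the association as FetchType.LAZY (the JPA default for @OneToMany) or, better, using a projection query that selects only name and status would reduce the report to a handful of statements.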

If you want to learn more about database access problems, check out the post on load balancers causing database locks when it is really the database to blame, or the “Understanding Hibernate” series: Part I – Session Cache, Part II – Query Cache and Part III – Second Level Cache.

Metric: Total Number of SQL Statements per Transaction

The first metric you want to take a closer look at is the total number of SQL executions per transaction. If you want to go a step further you can even monitor SELECT, INSERT, DELETE and UPDATE statements separately.
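
If you don't have a tool that reports this number out of the box, a simple do-it-yourself counter already goes a long way. The following is a minimal sketch (class name and wiring are assumptions): call recordStatement() from whatever SQL interception point your stack offers, e.g. a JDBC proxy driver, an O/R mapper interceptor or parsed SQL logs.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: counts SQL statements per transaction, broken down by
// statement type (SELECT, INSERT, UPDATE, DELETE, ...).
public class SqlStatementCounter {

    private final Map<String, AtomicInteger> countsByType = new ConcurrentHashMap<>();

    public void recordStatement(String sql) {
        // The first keyword of the statement identifies its type.
        String type = sql.trim().toUpperCase(Locale.ROOT).split("\\s+", 2)[0];
        countsByType.computeIfAbsent(type, t -> new AtomicInteger()).incrementAndGet();
    }

    public int total() {
        return countsByType.values().stream().mapToInt(AtomicInteger::get).sum();
    }

    public Map<String, AtomicInteger> byType() {
        return countsByType;
    }
}
```

Reset such a counter at the start of each business transaction (Login, Search, Checkout) and report total() and byType() at the end; that gives you exactly the numbers discussed in this section.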

If you are always aware of how many database statements are executed for your individual transactions (Login, Search, Checkout) and you monitor this along the delivery pipeline for every build, you will immediately see how newly added functionality impacts the load on your database. The following screenshot shows a way to track this number across builds and across your different deployment stages. In this scenario, Developers extended the search feature in Build #3 by making an additional call to a 3rd party recommendation service. Build #3 suddenly shows a huge spike in SQL queries in the Load stage and in Production. Why is that?

A new call to an external 3rd party service introduced with Build 3 has major impacts on the load (capacity stage) and production environment when this new feature has to deal with real production data

What can we learn from these metrics above?

  • Commit stage: The unit tests executed in that stage didn’t catch the problem because the call to the 3rd party service was mocked, and therefore that service executed no actual DB calls.
  • Acceptance stage: A change in behavior was detected due to the additional call to the recommendation service, which executed 2 additional SQLs. This could already be a warning sign but will probably still go unnoticed.
  • Capacity stage: A 200x increase in DB calls must stop this deployment so that it never reaches Live. The reason we see 200x and not just 2 more queries compared to the Acceptance stage is that the database used in the Load stage is much more “production-like” than the test database used for the integration tests. This fact alone should be a trigger to run the integration tests against more than just sample database content.
  • Production: 4k SQLs instead of just 2 has a huge impact on production. This again can be explained by the fact that the production database holds “real life” data and the database access pattern of this 3rd party service queries every piece of it. If this really makes it into production and we monitor this data down to the transaction level, it is easy to pull this change back and engineering can immediately start working on the problematic area. Most importantly, however, this build should never make it into production at all, because the problem was already found in testing!

How to Measure on Dev Workstations

Developers can look at this data by turning on the SQL logging options of the frameworks they are using, e.g. Hibernate (see this Stack Overflow discussion), by profiling their code with the profiler that comes with their IDE, or by using tools such as the Development Edition of Dynatrace (or the Free Trial) to see all database calls made by their own code or by the code they are calling. The following screenshot, taken from Database Access Patterns Gone Wild, shows which data can be analyzed on a local machine:

Developers can analyze which SQL statements are executed by their own code or 3rd party frameworks they use. In this case it was code executed by Telerik to populate .NET control data.
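
As a concrete example of the SQL logging options mentioned above, Hibernate can be told to print every statement it generates. The properties below are standard Hibernate settings; how you pass them (hibernate.cfg.xml, persistence.xml, Spring configuration, ...) depends on your project, so treat the surrounding code as an illustrative sketch:

```java
import java.util.Properties;

// Illustrative sketch: standard Hibernate settings that make every generated
// SQL statement visible in the application log.
public class HibernateSqlLogging {

    public static Properties sqlLoggingProperties() {
        Properties props = new Properties();
        props.setProperty("hibernate.show_sql", "true");         // print each generated SQL statement
        props.setProperty("hibernate.format_sql", "true");       // pretty-print for readability
        props.setProperty("hibernate.use_sql_comments", "true"); // include the originating HQL/Criteria as a comment
        return props;
    }
}
```

Counting the logged statements per use case already gives a first approximation of the metric discussed here.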

How to Measure in the Commit Stage

Automation Engineers can use the same tools and mechanisms in combination with their testing tools.

The key is to capture these metrics for every build and then “learn” what the expected value for DB access should be. If you have a smart enough system, it learns this for you and alerts you in case these metrics are “out of the norm” for your individual tests. The following screenshot taken from dynaTrace Test Center Edition shows an example where the DB Count Metric for the testLogin test jumped from 3 (expected based on historical data) to 70!

Automatically identify regressions from build to build by looking at metrics such as # of SQL Executions
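
If you don’t have a system that learns the expected values for you, even a hard assertion in a functional test catches the worst regressions. A minimal sketch, reusing the hypothetical SqlStatementCounter from above (the threshold, test and class names are assumptions):

```java
import static org.junit.Assert.assertTrue;

import org.junit.Test;

// Illustrative sketch: fail the build when a test suddenly needs far more SQL
// statements than previous builds did.
public class LoginSqlBudgetTest {

    // Expected value based on previous builds; adjust deliberately when the
    // functionality legitimately changes.
    private static final int EXPECTED_SQL_STATEMENTS = 3;

    @Test
    public void loginStaysWithinSqlBudget() {
        SqlStatementCounter counter = new SqlStatementCounter();

        // ... execute the Login use case here, with the counter wired into
        // your SQL interception point (see the earlier sketch) ...

        assertTrue("Login executed " + counter.total() + " SQL statements, expected at most "
                        + EXPECTED_SQL_STATEMENTS + " (possible DB access regression)",
                counter.total() <= EXPECTED_SQL_STATEMENTS);
    }
}
```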

How to Measure in the Capacity Stage

Performance Engineers use Load Testing Tools which typically don’t provide this metric out of the box, as they mainly report on Page Load Time or Number of Transactions. Therefore it is important to extend your load testing reporting by integrating it with the tools used in CI or by developers to capture that metric. Looking at SQL log files would be an alternative, but because they lack the context of the executing transaction (Search, Login, …) you won’t know where a sudden flood of calls is coming from.
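
One pragmatic way to get that transaction context is to have the load testing tool send a custom header with every request and to group the server-side SQL counts by that header. The following is only a sketch; the header name X-Test-Transaction and the servlet filter are assumptions, not features of any particular load testing tool:

```java
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

// Illustrative sketch: read the transaction name sent by the load testing tool
// (e.g. "Search" or "Login") so SQL counts can be attributed to the tested transaction.
public class TransactionTaggingFilter implements Filter {

    public static final ThreadLocal<String> CURRENT_TRANSACTION = new ThreadLocal<>();

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        String name = ((HttpServletRequest) request).getHeader("X-Test-Transaction"); // assumed header name
        CURRENT_TRANSACTION.set(name != null ? name : "unknown");
        try {
            chain.doFilter(request, response);
        } finally {
            CURRENT_TRANSACTION.remove();
        }
    }

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void destroy() {
    }
}
```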

The following is a screenshot from dynaTrace that provides this data for each individual tested transaction – making it easy to figure out how many database statements are actually processed by the test:

Direct access to all database queries executed by the individual transactions

Besides looking at a single test result, you also want to compare the results of two tests, as shown in the following screenshot. You want to know whether there was a change in the # of SQLs when you put the application under load:

Comparing the number of SQL executions between two test runs gives you confidence that you did not introduce a regression
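
If your tools can export the raw numbers, the same comparison can be scripted. A minimal sketch that diffs the SQL count per transaction between two runs and flags increases (the transaction names and counts are made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: compare SQL executions per transaction between two load
// test runs and print the transactions whose count went up.
public class SqlCountComparison {

    public static void main(String[] args) {
        // In practice these maps would be exported from your load test or APM tooling.
        Map<String, Integer> baselineRun = new LinkedHashMap<>();
        baselineRun.put("Login", 3);
        baselineRun.put("Search", 12);

        Map<String, Integer> currentRun = new LinkedHashMap<>();
        currentRun.put("Login", 3);
        currentRun.put("Search", 26);

        currentRun.forEach((transaction, count) -> {
            int baseline = baselineRun.getOrDefault(transaction, 0);
            if (count > baseline) {
                System.out.printf("Possible regression in %s: %d SQL statements (was %d)%n",
                        transaction, count, baseline);
            }
        });
    }
}
```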

How to Measure in Production

Even though everything is tested well, you still need to monitor the same metric in production. Why? Because testing can’t cover every use case and may not have been able to exercise the application with the full production database. Therefore it’s important to keep an eye on the same metrics and alert in case database access patterns change. Furthermore, looking at the same metrics in testing and production environments makes life easier in a DevOps world, as the metrics and their meaning are familiar to everyone involved.

The following image shows a production dashboard that monitors the number of total SQL calls in comparison to the number of users on the system. These two numbers should correlate – even when you do a new deployment – otherwise you know that the new deployment has a regression as it is executing more statements than before:

In a production environment you want to monitor Database Access in correlation to load on your system. This dashboard shows both DB Count and Time in Correlation with Transaction Count
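
A simple way to turn “these two numbers should correlate” into an alert is to track SQL statements per transaction as a ratio and compare it against a baseline. The sketch below is illustrative only; the baseline, tolerance and the way you obtain the two counts depend entirely on your monitoring setup:

```java
// Illustrative sketch: flag an interval as suspicious when the ratio of SQL
// statements to business transactions drifts above the baseline observed
// before the deployment.
public class DbLoadCorrelationCheck {

    private final double baselineSqlPerTransaction;
    private final double tolerance; // e.g. 0.2 allows a 20% deviation

    public DbLoadCorrelationCheck(double baselineSqlPerTransaction, double tolerance) {
        this.baselineSqlPerTransaction = baselineSqlPerTransaction;
        this.tolerance = tolerance;
    }

    public boolean isRegression(long sqlStatements, long transactions) {
        if (transactions == 0) {
            return false; // nothing to correlate in this interval
        }
        double current = (double) sqlStatements / transactions;
        return current > baselineSqlPerTransaction * (1 + tolerance);
    }

    public static void main(String[] args) {
        DbLoadCorrelationCheck check = new DbLoadCorrelationCheck(40.0, 0.2); // made-up baseline
        System.out.println(check.isRegression(90_000, 1_000)); // 90 SQL per transaction -> true
    }
}
```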

If you have critical transactions such as Login, Search, etc. you also want to monitor these transactions individually to constantly verify their database access behavior:

For critical transactions such as Login, Search it is recommended to monitor the average number of database calls for these transactions. It shouldn’t go up with increasing load or a new deployment

What Does This Mean for You?

This is the second quality metric in our series of CD blog posts: pay attention to the number of SQL executions per transaction as you proceed along your delivery pipeline. Here is the key takeaway for your specific role:

  • Developers: Understand the impact of introducing a new data access layer (such as an O/R Mapper), adding custom SQL calls or calling third party frameworks. There are tools available that provide this data on demand or in an automated fashion.
  • Performance Engineers: Make sure to test your applications in an environment that is as close to production as possible, and test them through the same interface the end user will use, e.g. a browser. That ensures you exercise all components (including 3rd party components) and test against a production-like database to capture data-driven problems as explained above.
  • Production and Business: Not every use case can be tested – therefore it is important to have this type of monitoring in place. Report this data to your engineering team so that they learn about the impact of implementation changes as well as the impact of real production data vs data in test or staging.

The next metrics we are going to look into are # of Exceptions and # of Log Messages written. Stay tuned and feel free to comment below with your own metrics.