Have you been asked to look into load-testing your software? Have you bought, or are you about to buy, one of these “Testing for Dummies” books to get a kick-start in that domain? This blog explains some of the basic concepts, challenges, terminology and approaches for load testing software applications. It is a summary of the work that I’ve done in the past (I used to work for a load testing company) and it also highlights how our current customers deal with their load testing needs.

Why Web- and Load-Testing?

There are several questions you should be able to answer before deploying a new application:

  • Does my application still work correctly when there is more than one user on the system?
  • Does my application still respond fast enough with a growing number of concurrent users?
  • How does my application scale with a growing number of concurrent users?
  • Where are the bottlenecks in the application and its architecture?
  • How much load does my application need to handle?
  • How much load can my application handle?
  • What is the hardware requirement to handle the required load?

To answer these questions it is necessary to perform Web- and Load Tests. I refer to Web Tests as tests that test a single use case against your application, e.g.: Logging in, searching for products, purchasing items, logging out. You should have several of these use cases that can be derived from the requirements of the application you are building. Testing these use cases verifies the functionality of the software.

The goal of Load Testing is to verify that both the functionality and the performance of the individual use cases still hold when many simulated end users execute these common use cases at the same time. A simulated end user is referred to as a Virtual User. A use case is often referred to as a Test Case. The definition of how many and which types of use cases should be executed by how many virtual users is called the Workload.

Why do Load Testing Projects fail?

I’ve seen several issues with Load Testing projects that caused them to fail or delayed the whole project. All these problems can be avoided upfront when you do your homework.

Application is not ready for testing

Many times I’ve seen shaking heads when the test team was ready to run 1000 users against the application for the first time. The head shaking started when the testing tool reported errors with 2 or 5 concurrent users and response times went through the roof with only a fraction of the desired load.

In all these cases the application was not ready for testing, as some basic problems in the implementation or in the architecture did not allow any high load on the system. These problems can be avoided early on during development by testing more and by taking more time to think about an architecture that fits the performance requirements of the application. The result of these situations is wasted time for the testers and for the test software and hardware.

No real life test use cases and test data

One of the trickiest things in testing is to come up with real world test scenarios. What is the typical user going to do? How does he click through the pages? How much time is he going to spend between pages (that’s what we call Think Time)? What is the search criterion that is going to be used to find a product?

If real end users are going to work with your application in a totally different way than you test, you are no better off than with no testing at all. It is important to spend enough time thinking about the real use case scenarios. Talk with existing users and analyze access logs or other captured data from your current system or previous versions of the product.

Real test data is as important as real test cases. If you only test with a fraction of the data you have in production you are not getting accurate performance values and are not going to find problems related to the data access layer (which has proven to be one of the top problems in every application). So spend enough time to get quality real life data in the expected real life volume.

Problems with the Load Testing Infrastructure

Hardware for testing keeps getting cheaper. Still, it is important to understand the requirements you have for your test infrastructure. Make sure you have enough network bandwidth to handle the load. You don’t want to get misleading response times because your network is congested.

The load generators must also be able to handle the number of simulated users. Every testing tool requires CPU and Memory for each simulated user. If the machines cannot handle the number of virtual users you put on them, your results are basically useless. Many tools have built-in monitors to alert you when the machines get overloaded. They also suggest how best to distribute the load across your load generators so that you do not run into problems.
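A quick back-of-the-envelope check can tell you early whether your generators are likely to cope. The sketch below assumes that memory is the limiting resource and that you know roughly how much memory your tool needs per virtual user. The function name, the headroom value and all figures are hypothetical and only serve as illustration; take the real per-user footprint from your tool's documentation or from a measurement.

```python
import math

def generators_needed(virtual_users, mem_per_user_mb,
                      usable_mem_per_generator_mb, headroom=0.25):
    """Estimate the number of load generators when memory is the limit.

    headroom is the fraction of each generator's memory that is kept
    free so the generator itself does not become the bottleneck.
    """
    budget_mb = usable_mem_per_generator_mb * (1.0 - headroom)
    users_per_generator = int(budget_mb // mem_per_user_mb)
    if users_per_generator < 1:
        raise ValueError("a single virtual user does not fit on one generator")
    return math.ceil(virtual_users / users_per_generator)

# Hypothetical figures: 1000 users, 5 MB per user, 2048 MB usable per machine.
print(generators_needed(1000, 5, 2048))
```

The same calculation should be repeated for CPU and network bandwidth; the largest of the results is the one that counts.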

Not enough data was captured during the load test

What good is even the best load test if the only thing it tells you is the actual response times of your pages under a certain load? What do you do next if you want to find out why the search didn’t respond as fast as it should? You need additional data for root cause analysis.

Basic performance counters like CPU, Memory, Network and I/O can be captured automatically by most load testing tools. This already gives you an indication of where your problem might be. Application-specific performance counters or log files are the next step to get closer to the root cause, but they still require a lot of manual work and most often require test re-runs in order to refine data capturing.

Using an Application Performance Management Solution for your load tests takes out the guesswork and eliminates extra test runs and manual log file analysis. Read about how Novell increased their testing throughput by 2x/3x.

What are the main reasons for bad performance?

The following illustration highlights the most common performance problems that I’ve seen in applications:

Typical reasons of performance problems

For more details on actual problems, check out the blog posts Alois wrote about Application Performance.

What shall be tested

We already talked about the importance of use case scenarios and test data. If we take the use case of a shopping transaction, we should not run the transaction several thousand times while always searching for and purchasing the same product. Nor do we want all our simulated Virtual Users to wait exactly the same time between the display of the search result and the click on the Put Into Cart button.

In order to get a more realistic test you have to randomize your input data. Use a database that contains all possible combinations of search queries, or be creative in your test script and generate meaningful random input values. Many testing tools provide a feature to data drive a test case. This enables you to re-use the same test case and execute it with different input for different virtual users. Important: Make sure the tool gives you the ability to a) run the same sequence of random inputs again (in order to compare test runs) and b) avoid using the same random value twice (sometimes necessary depending on your use cases).
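The two requirements for randomized input, a repeatable sequence and optionally no repeated values, can be sketched in a few lines. This is only an illustration of the idea and not how any particular load testing tool implements it; the function name and the search terms are made up.

```python
import random

def build_input_sequence(values, count, seed, unique=False):
    """Pick `count` inputs from `values` in a repeatable pseudo-random order."""
    rng = random.Random(seed)  # private generator: same seed, same sequence
    if unique:
        if count > len(values):
            raise ValueError("not enough distinct values for a unique sequence")
        return rng.sample(values, count)  # sampling without replacement
    return [rng.choice(values) for _ in range(count)]

# Made-up test data standing in for real search queries.
search_terms = ["laptop", "phone", "camera", "headset", "monitor"]

# a) The same seed reproduces the same inputs, so two test runs are comparable.
run_1 = build_input_sequence(search_terms, 8, seed=42)
run_2 = build_input_sequence(search_terms, 8, seed=42)
print(run_1 == run_2)

# b) With unique=True no value is handed out twice.
unique_run = build_input_sequence(search_terms, 5, seed=42, unique=True)
print(len(set(unique_run)) == len(unique_run))
```

Storing the seed together with the test results is what makes a later re-run with identical input possible.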

Think Times: finding the correct time a simulated user should wait between individual test steps is a science of its own. In order to simulate a real end user the test case must also simulate the time a real user waits (or thinks) between two test steps. When you click on one link it takes you a while to click on the next link: you may read some information on the page or need a moment to find the next link. Simulating Think Times MUST BE part of the load testing tool’s feature set.

I often ran into the situation that Think Time was left out on purpose. Why? In order to run more load against the servers. While this is great for stress testing (how many transactions per second can we handle?), it is not a realistic test. Having fewer virtual users that execute more transactions per second is different from having more virtual users with fewer transactions per second. Why? There are multiple reasons, e.g.: fewer network connections handled by the server, less memory consumption for active user sessions, fewer database connections, …
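To put numbers on this difference, here is a rough calculation. It assumes a closed model in which every virtual user sends a request, waits for the response, thinks, and then starts over, so that throughput is the number of users divided by response time plus think time (Little's Law). It also assumes the response time stays constant, which in reality it will not under rising load. All figures are invented for illustration.

```python
def throughput_per_second(virtual_users, response_time_s, think_time_s):
    """Steady-state transactions per second in a closed model."""
    return virtual_users / (response_time_s + think_time_s)

# 100 users who think for 9.5 s between steps ...
realistic = throughput_per_second(100, response_time_s=0.5, think_time_s=9.5)

# ... produce the same transaction rate as 5 users without any think time.
no_think_time = throughput_per_second(5, response_time_s=0.5, think_time_s=0.0)

print(realistic, no_think_time)
```

Both setups generate the same number of transactions per second, but in the first one the server has to hold 100 user sessions and their connections, in the second one only 5.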

Workload: The workload defines how many virtual users are executing which test cases over which period of time. Workloads come in different flavours as they serve different purposes:

  • An Increasing Workload starts with a small number of virtual users and increases them step by step over time. This helps to identify how the application scales and where it breaks.
  • A Steady-State Workload holds a certain load over time. This is great to identify whether performance stays steady or whether constant load harms the application, e.g.: through memory or connection leaks.
  • A Goal-Based Workload allows you to define how many transactions should be handled by the application in a certain timeframe, e.g.: 10000 per hour. This is great to verify whether you meet your business goals.
  • A so-called All-Day Workload (a term taken from SilkPerformer) allows you to simulate the load pattern of a typical work day. Throughout the day you have different numbers of users performing different tasks at certain times, e.g.: login at 8 am, nothing during lunch break, lots of final tasks at the end of the day. This workload model lets you reproduce these patterns and verify that your application can handle a typical work day load.
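The four workload types can be expressed as simple schedules. The following is a minimal sketch with hypothetical function names and made-up numbers (apart from the 10000 transactions per hour used as the example above); real tools configure these models through their own workload editors or script settings.

```python
def increasing_workload(start_users, step_users, max_users):
    """Virtual-user count for each step of a ramp-up, capped at max_users."""
    steps = []
    users = start_users
    while users < max_users:
        steps.append(users)
        users += step_users
    steps.append(max_users)
    return steps

def steady_state_workload(users, steps):
    """The same virtual-user count for every step."""
    return [users] * steps

def goal_based_pacing(transactions_per_hour, virtual_users):
    """Seconds each user may take from one transaction start to the next."""
    return 3600.0 * virtual_users / transactions_per_hour

def users_at(hour, profile):
    """Look up the user count of an all-day profile.

    profile is a list of (start_hour, users) pairs sorted by start_hour.
    """
    current = 0
    for start_hour, users in profile:
        if hour >= start_hour:
            current = users
    return current

# Made-up all-day profile: login at 8, lunch break at 12, rush before 18.
ALL_DAY_PROFILE = [(8, 200), (12, 0), (13, 120), (17, 250), (18, 0)]

print(increasing_workload(10, 20, 100))
print(steady_state_workload(50, 4))
print(goal_based_pacing(10000, 50))
print(users_at(9, ALL_DAY_PROFILE), users_at(12, ALL_DAY_PROFILE))
```

The pacing value of the goal-based model includes response time and think time, so it only works if a single transaction finishes well within that interval.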

What results to collect and what they tell us

The load testing tools usually provide information like transaction response time, page response times, # of errors, # of virtual users, transmitted bytes, …

These values are all great but they do not help you a lot when you try to find the root cause of a slow page. For that you need to collect additional performance counters from across your application infrastructure. This includes CPU, Memory, I/O, Network and Threads/Handles of your application servers, as well as specific performance counters of your database, web server and load balancer. It should also include application-specific counters exposed via custom performance counters or JMX to get more insight into what is going on within the application code, e.g.: # of processed orders, …

The reason for degrading page performance can be somewhere in the chain of all involved components/services that fulfil the page request. An exhausted database connection pool can cause many end users to wait for their request to be handled. Memory issues in one transaction can cause an expensive Garbage Collector run that blocks all other currently active transactions. A congested network between your application servers can cause timeouts and errors. Non-optimized SQL Statements can cause the database to max out in CPU Usage. These are all reasons for bad application performance. The symptoms can be seen by monitoring all sorts of performance counters. Having those performance counters on hand and correlating them with each other can give you a better understanding about why performance is bad.
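One simple way to correlate counters with each other is to compute the correlation between each counter and the response time over the same measurement intervals, and then look at the strongest ones first. The sketch below uses the plain Pearson coefficient. The counter names and all sample values are invented for illustration, and a high correlation is only a hint where to look, not proof of the cause.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with 2+ samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

def rank_counters(response_times, counters):
    """Sort counters by the strength of their correlation with response time."""
    scored = [(name, pearson(values, response_times))
              for name, values in counters.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Invented samples, one value per measurement interval.
response_ms = [120, 130, 150, 400, 900, 1500]
counters = {
    "active_db_connections": [5, 6, 8, 18, 20, 20],
    "app_cpu_percent": [35, 38, 36, 37, 35, 36],
}

for name, r in rank_counters(response_ms, counters):
    print(f"{name}: {r:+.2f}")
```

In this made-up data set the database connections rise together with the response time while CPU stays flat, which would point towards the connection pool rather than towards the application server's CPU.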

Load Testing Results

Even though load testing tools have come a long way in collecting additional performance counters, they still do not give you enough information for rapid root cause analysis. For that I encourage you to look into Application Performance Management as mentioned earlier. With Application Performance Management you can really unleash the power of load testing by not only getting your tests done right but also finding the root cause of the problems that the tests uncover.

What tools are out there?

The number of tools out there is growing daily. There are the commercial test tools like HP Mercury, MicroFocus/Borland SilkPerformer, Microsoft Visual Studio for Testers, IBM/Rational Performance Tester, iTKO Lisa, Compuware QALoad, Neotys … Besides the commercial tools there is an ever-growing list of Open Source tools like JMeter, Selenium, PushToTest, WebLoad, … that provide great testing capabilities for free.

There is also a growing list of load testing services like SOASTA, Keynote, BrowserMob, Load Impact, Load Storm, … which eliminate the need to build up a test center as the load gets generated from machines on the internet (or Cloud).

My Tool Requirements

Choosing a tool is not easy. It obviously depends on whether or not you want to spend money on a commercial solution. The popularity of open source solutions definitely shows that these tools do a pretty good job. Often these tools evolved out of the need to automate testing in smaller test teams. Therefore most of these tools can easily be integrated into your test process or continuous integration.

The tool choice in the end depends on your requirements. Some key requirements for me are easy script creation and manipulation. Record and Replay alone doesn’t do the trick: it is a good start but I’ve hardly seen a case where you could run with the recorded script as is. Having a scripting language or some way to modify/data drive the test script is a requirement. The next big item is script maintainability. Most often the test script that you write today won’t work tomorrow because the application has changed. Rather than re-recording all the tests all the time you should look for a tool that makes it easy to adapt scripts to changes in the application. And then, as mentioned earlier, it is about integrating these tools with your existing process to automate test execution and analysis as much as possible.

In the beginning I talked about Web- and Load-Testing. Web-Testing for me is testing the application to verify its functionality by executing my Use Cases with a single user for every build or code change that I make. A test tool should therefore be able to run these tests in a single user mode without requiring expensive setup steps. Ideally I want to automate test execution (how often have I mentioned automation yet? 🙂 ) of these web tests as part of every build so that I know if my software is still working according to specification. If these tests all pass I can take my test cases and run a real load test to also verify performance. The requirement therefore is a tool that can re-use the same tests as functional web tests as well as performance tests, and all that with an automation interface in place so that I do not have to click the start button manually for every build.
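The idea of re-using one test case for both purposes can be sketched as follows. The test case here is a hypothetical stand-in that only sleeps instead of driving a real application, and plain threads take the place of a load testing tool's virtual user engine; the point is merely that the same function runs unchanged with one user or with many.

```python
import threading
import time

def login_and_search(user_id):
    """Hypothetical test case; a real one would drive the application."""
    time.sleep(0.01)  # stands in for the requests of the use case
    return True       # True means all functional checks passed

def run_test(test_case, virtual_users=1):
    """Run test_case once per virtual user and collect (id, ok, seconds)."""
    results = []
    lock = threading.Lock()

    def worker(user_id):
        start = time.perf_counter()
        try:
            ok = bool(test_case(user_id))
        except Exception:
            ok = False  # an exception counts as a failed test case
        elapsed = time.perf_counter() - start
        with lock:
            results.append((user_id, ok, elapsed))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(virtual_users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

# Functional web test: one user, suitable for every build.
functional = run_test(login_and_search)

# Load test: the very same test case executed by 20 concurrent users.
load = run_test(login_and_search, virtual_users=20)

print(all(ok for _, ok, _ in functional), len(load))
```

Because both modes are started through the same call, a build server can trigger either of them without anybody clicking a start button.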

Load Testing in action – how to do it?

I am doing a joint webinar with PushToTest next week where we are going to show the life cycle of a test, from functional testing to load testing to root cause analysis. If you are interested, check it out; it is free: Automate Testing & Root Cause Analysis with PushToTest & dynaTrace

I will also sum up the steps for successful web- and load-testing after the webinar in a separate blog entry. In order to make that a success I ask you for your feedback on load-testing. Also feel free to ask any questions you have on this topic and I will try to address them as well.

More on Load Testing

We just published a new White Paper about how to improve performance analysis and do load testing effectively. It discusses the challenges in load testing, how to address them, how to make the load testing process more efficient and how to increase the throughput of test centers.