Basic load testing is valuable, but it's important to move past simplistic efforts. Here are some ways to gain more accurate metrics from your load tests.
Load testing takes a lot of time and effort to set up correctly. There are many factors to plan for and implement, and some of them can be quite subtle.
There’s a tremendous amount of value in getting even a rudimentary load testing scenario up and running quickly. Immediate feedback on the state of your system can yield great rewards, but in many cases it’s extremely important to move quickly beyond the initial basic load testing effort. These simplistic scenarios can present an incomplete or, worse, outright misleading picture of how your system is behaving under stress.
Taking the time and effort to move beyond basic load testing is a crucial step to ensure that stakeholders and customers receive accurate metrics in order to make correct decisions.
Baseline Data
One of the most critical, often overlooked aspects of getting an accurate picture of your system’s performance is the data you use when testing your system. This is unfortunate, because the size, trend, and “shape” of the data have a tremendous impact on so many things across your entire application: UI rendering, processing in the business layer, and, of course, the data tier. (Note: “Shape of data” is a phrase I’ve come up with to describe things like trends and distribution of data. For example, a social networking platform may be extremely heavy in blog usage but have little wiki or media content.)
If you’re working with a business-intelligence or data-analytics system, it’s simply unrealistic to have only a clean or empty data set when checking your load. I’ll actually go a step further and say it’s downright irresponsible. You need your scenarios to validate against the realistic demands of processing months or years of data. Similarly, if you’re trying to profile an e-commerce site, you want a realistic set of products, reviews, customer records, etc., to comprise your working data set.
Getting ahold of or creating your data sets is an important task for which you’ve got to plan and dedicate time. Are you going to create your data, or are you going to use a real-world data set? Both options bring their own sets of challenges and constraints.
Real-World Data, Real-World Headaches
If you’re lucky, you may be able to get ahold of a real-world data set. There’s nothing better than using data that represents exactly how the system is used in the wild! In the past, I’ve reached out to customers with whom I’ve had great relationships. Using a “live” data set from a customer often means coming up with some scripts to sanitize the data. You want to ensure that you’re respecting and protecting the customers’ sensitive data, and sometimes this data may have potential legal liabilities attached to it.
If you’re sanitizing a real data set, you’ll need to ensure that you’re not changing the trend, size, or shape of the data. If you’re trying to eliminate potentially sensitive discussion threads from a company forum or mail list, you’ll want to ensure that you replace the discussions with text that’s similar in word count. The same example goes for other types of data, like documents, media files, etc.
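As a minimal sketch of the word-count idea above, the Python snippet below swaps every word of a sensitive post for neutral filler while keeping the number of words the same. The `FILLER_WORDS` list and the `sanitize_text` helper are hypothetical names invented for this illustration; a real sanitizer would likely also need to preserve things like paragraph breaks, attachments, and character encodings.

```python
import random

# Hypothetical filler vocabulary; any neutral word list would do.
FILLER_WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
                "adipiscing", "elit", "sed", "do", "eiusmod", "tempor"]

def sanitize_text(original: str, seed: int = 0) -> str:
    """Replace every word with filler, keeping the word count unchanged."""
    rng = random.Random(seed)  # seeded so repeated runs give the same output
    word_count = len(original.split())
    return " ".join(rng.choice(FILLER_WORDS) for _ in range(word_count))

post = "Q3 revenue missed the forecast by twelve percent"
clean = sanitize_text(post)
# The sanitized text has the same number of words as the original.
print(len(post.split()), len(clean.split()))
```

The same approach extends to other content types: replace the payload, but keep its size roughly where it was.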
You’ll also want to avoid changing dates around data-creation events, because significant impacts might happen across your system. For example, a trend-analysis routine might run blisteringly fast when there’s only one date but fall apart when pointed at data distributed across several years.
No Data? Make Your Own!
Perhaps you don’t have access to a good set of real-world data. That means you’ve got to roll up your sleeves and figure out how to create something useful. This can be extraordinarily daunting, particularly if you think you’re going to need huge amounts of data.
Make no bones about it: Rolling your own data will be time consuming. You’ll need to sit down with your entire team and figure out what scenarios you want to create data for, and you need to be extremely lean and realistic about what you’re going to bite off. If your system is new, do you really need to generate a data set simulating five or ten years’ worth of data? No! You don’t even know if your system will live that long, so why try to solve that sort of problem?
Brainstorm realistic data scenarios, and ensure that you’re covering all of the system’s use cases. I once worked on a data creation effort where we initially missed several small but important use cases that had a definite impact on the overall performance. Avoid this by getting multiple team members involved in the exercise, and run back through your list of features to ensure you’re covering everything.
A great source for ensuring you’re covering all those use cases is your functional automation suite, assuming you have one. Many UI-level automation suites are organized around functional areas, so you’ve got a great starting point for your coverage. Those same functional tests are also a good starting point for actually generating or creating your data. Reconfigure your suites so they don’t delete content they’re creating, then start looping through iterations and watch your data set grow. You’ll need to do some post-processing to adjust dates across a realistic spread, but getting an 80 percent solution is better than nothing.
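The date post-processing step might look something like the sketch below, which generates creation timestamps spread across a chosen range. The `spread_dates` helper is hypothetical, and the triangular distribution that weights timestamps toward the end of the range is purely an assumption (it models a system whose usage grows over time); pick whatever distribution matches your own expectations.

```python
import random
from datetime import datetime, timedelta

def spread_dates(count, start, end, seed=0):
    """Return `count` timestamps between start and end, sorted oldest first.

    Uses a triangular distribution peaking at `end`, an assumption that
    models a system whose usage grows over time.
    """
    rng = random.Random(seed)
    span_seconds = (end - start).total_seconds()
    offsets = [rng.triangular(0, span_seconds, span_seconds)
               for _ in range(count)]
    return sorted(start + timedelta(seconds=o) for o in offsets)

# Example: 1,000 records spread over three years.
dates = spread_dates(1000, datetime(2021, 1, 1), datetime(2024, 1, 1))
print(dates[0], dates[-1])
```

You would then write these timestamps back over the creation dates of the records your looping test suite generated.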
If you don’t have functional tests in place already, then step back and look for the easiest way to create your content. Do you have web services in place that support content creation or stored procedures in your database? Some modest effort with your developer crew will give you some one-off tools capable of building you what you need to spin up your initial data.
Keep in mind that the tools you’re building for this effort only need to be simple, lightweight scripts, batch files, or tiny programs. Moreover, you will likely end up stringing together a series of small tools to do the job for you. Don’t invest huge amounts of time in this. Build just enough to get you moving, nothing more.
One Data Set Isn't Enough
If your system is large or complex enough, you’ll likely want several sets of data in different sizes and shapes. For example, perhaps you’re working with a platform that offers both forums and blogs. Having separate data sets skewed heavily in both directions, plus a balanced mix, can help inform you if you’re running into performance issues in different environments.
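One lightweight way to keep several data-set shapes straight is to describe each one as a named profile and derive record counts from it. The sketch below assumes the forum-and-blog platform from the example; the profile names, the 80/20 splits, and the `record_counts` helper are all illustrative assumptions, not recommendations.

```python
# Hypothetical content mixes; the fractions in each profile sum to 1.0.
PROFILES = {
    "forum_heavy": {"forum_posts": 0.8, "blog_posts": 0.2},
    "blog_heavy": {"forum_posts": 0.2, "blog_posts": 0.8},
    "balanced": {"forum_posts": 0.5, "blog_posts": 0.5},
}

def record_counts(profile_name, total_records):
    """Translate a profile's fractions into per-type record counts."""
    mix = PROFILES[profile_name]
    return {kind: round(total_records * share) for kind, share in mix.items()}

for name in PROFILES:
    print(name, record_counts(name, 100_000))
```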
Above, I said it’s irresponsible to test with only an empty or nearly empty data set. Note that I specifically said “only.” Actually, an empty database—or whatever your initial default data set is—can give you some interesting results about your system, and it’s tremendously easy to set up!
Mix of Use Cases
Creating a realistic load test scenario means setting up more than just three or four use cases and throwing a thousand virtual users against those scripts. You need to plan ahead and create use cases that model actual system usage as closely as possible.
For existing systems, your web server logs are a great place to get some modeling information for this kind of activity. Your logs will let you know how users are accessing your system. Your load testing tools may even be able to generate load use cases directly from these logs, which is a huge benefit!
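If your tool can’t consume logs directly, a small script can still tell you which paths carry the most traffic. The sketch below assumes access logs in the Common Log Format, where the request appears in double quotes; the `path_shares` helper and the sample lines are made up for illustration, and your own log format may need a different pattern.

```python
import re
from collections import Counter

# Matches the quoted request in a Common Log Format line,
# e.g. "GET /blog/post-1 HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*"')

def path_shares(log_lines):
    """Return each requested path's share of total traffic, busiest first."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            # Drop the query string so /search?q=a and /search?q=b count together.
            counts[match.group("path").split("?")[0]] += 1
    total = sum(counts.values())
    return [(path, n / total) for path, n in counts.most_common()]

# Made-up sample lines standing in for a real log file.
sample = [
    '10.0.0.1 - - [01/Mar/2024:10:00:00 +0000] "GET /blog/post-1 HTTP/1.1" 200 5120',
    '10.0.0.2 - - [01/Mar/2024:10:00:02 +0000] "GET /products/42?ref=ad HTTP/1.1" 200 9800',
    '10.0.0.3 - - [01/Mar/2024:10:00:05 +0000] "GET /blog/post-1 HTTP/1.1" 200 5120',
    '10.0.0.2 - - [01/Mar/2024:10:00:09 +0000] "POST /cart HTTP/1.1" 302 0',
]
for path, share in path_shares(sample):
    print(f"{path}: {share:.0%}")
```

The resulting shares are a reasonable starting point for deciding how many virtual users to assign to each use case.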
If you’re working with a new system without any history of real-world usage, then you’re going to need to go back to your brainstorming work with your team. Think of all the various system routes through which your users might traipse, and keep in mind that many users may start not at your site’s home page but instead at other areas of your site through external sources such as links from articles or search engines.
I use an e-commerce sample starter kit for a popular content management system as a source for many of my demos. This starter kit is a wonderful example because it lets me model out a great many use cases, such as the following:
- An anonymous user navigates directly to an article on the site’s blog.
- An anonymous user navigates directly to an article on the site’s blog and leaves a comment.
- An existing user navigates to his account profile and instigates a support ticket.
- A new user hits the home page and browses to a number of different products while reading a bit about each.
- An existing user hits the home page, logs into his profile, browses to a product, adds it to his shopping cart, and checks out.
- An existing user hits the home page, browses to a product, adds it to his shopping cart, checks out, and is required to log on.
- An existing user hits the home page, browses to a product, adds it to his shopping cart but does not check out, and, instead, leaves his browser open for the rest of the day.
Note the mix of similar but subtly different use cases. Take it a step further and perhaps consider data driving the checkout process so that users are shipping to different locations with different shipping rates. Ensure that you’re modeling all types of credit transactions, too.
Spend time building an accurate, complex mix of your use cases. You’ll find your system behaves very differently in such situations, and this is critical information to have.
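To make the idea of a weighted mix concrete, here is a minimal sketch that assigns each virtual user one of the use cases listed above according to a relative weight. The weights are invented for illustration (ideally they would come from your logs or your team’s brainstorming), and `assign_scenarios` is a hypothetical helper rather than a feature of any particular load testing tool.

```python
import random
from collections import Counter

# Hypothetical relative weights for the use cases listed above.
SCENARIO_WEIGHTS = {
    "anonymous_reads_blog_article": 40,
    "anonymous_comments_on_article": 5,
    "existing_user_opens_support_ticket": 5,
    "new_user_browses_products": 25,
    "existing_user_logs_in_then_checks_out": 10,
    "existing_user_checks_out_then_logs_in": 5,
    "existing_user_abandons_cart": 10,
}

def assign_scenarios(virtual_users, seed=0):
    """Pick one scenario per virtual user, in proportion to the weights."""
    rng = random.Random(seed)  # seeded so a test run can be repeated
    names = list(SCENARIO_WEIGHTS)
    weights = list(SCENARIO_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=virtual_users)

plan = Counter(assign_scenarios(1000))
for name, users in plan.most_common():
    print(f"{users:4d}  {name}")
```

Most load testing tools offer their own way to weight scenarios; the point is only that the proportions should be a deliberate choice.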
Monitoring Across Different Bandwidths
One focus of load testing is to find out where your system tips or fails, that is, where it stops responding after you’ve thrown an extreme number of concurrent users at it. A site that fails under expected load is an obvious, critical situation to protect against.
A subtler, but just as important, situation to guard against is poor user experience. The Aberdeen Group’s often-quoted 2008 study found that a one-second delay in page load time costs sites roughly 11 percent of page views and 7 percent of conversions. This translates directly to lost revenue for many sites.
Good load testing teams have long understood the need to validate things like response time under load. However, there’s a next step to this critical set of data. Do you know how users’ experiences are when they’re behind different bandwidth pipes?
Load testing is generally done on a stable, solid network. The results you get for response times and other end-user experience metrics can vary dramatically from what a user will truly see at his home or office. Subtle and not-so-subtle performance differences show up when you start to gather metrics for user agents behind simulated slower links. The huge pages that loaded fairly responsively when grabbed over a 1 Gbps Ethernet LAN likely aren’t going to show up so well when viewed from a low-cost neighborhood DSL endpoint.
Finding tools that will support these sorts of measurements will give you a much greater level of awareness of your site’s behavior for all your users, not just the optimal cases.
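A bit of back-of-the-envelope arithmetic shows why link speed matters so much. The sketch below estimates a lower bound on the time to move a page across links of different speeds; the 3 MB page weight, the 5 Mbps DSL figure, and the 50 ms latency are assumptions chosen for illustration. It ignores protocol overhead, TCP slow start, and parallel connections, so real load times will differ.

```python
def transfer_seconds(page_bytes, link_mbps, latency_ms=0.0, round_trips=1):
    """Lower-bound estimate: time on the wire plus round-trip latency."""
    bits = page_bytes * 8
    wire_time = bits / (link_mbps * 1_000_000)
    return wire_time + round_trips * latency_ms / 1000

PAGE_BYTES = 3_000_000  # assumed page weight: 3 MB

# Assumed link speeds, in megabits per second.
lan = transfer_seconds(PAGE_BYTES, link_mbps=1000)
dsl = transfer_seconds(PAGE_BYTES, link_mbps=5, latency_ms=50)

print(f"1 Gbps LAN: {lan:.3f} s")
print(f"5 Mbps DSL: {dsl:.3f} s")
```

Even this crude estimate puts the same page at a small fraction of a second on the LAN and several seconds over the slower link, which is the gap a bandwidth-aware load test is meant to expose.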
Don't Settle for Simplistic Load Scenarios
Getting started with load testing is an important step for your projects. However, speeding through your load efforts can leave your team and stakeholders with an inaccurate picture of your site’s true performance under load.
Take the time to think through what sorts of baseline data sets you need to work with, and spend the time to find or build sets that match your needs. Ensure that your load scenarios use a realistic mix of use cases, roles, and user actions. Finally, think about how to validate end-user experiences as those users actually see them, and not just about metrics that reflect a pristine environment in your network center.
Nailing down these challenges will ensure that you’re giving your team and stakeholders the best possible information on your site’s performance health.
User Comments
I agree but not all the way.
First: after reading this article, it is not clear to me whether the author sees a difference between performance, load, and stress testing.
Actually, looking back at his name, I am sure the author knows that difference very well, but the message in the article is confusing: the second paragraph talks about simplistic load scenarios that do not give a correct picture of how the system is behaving under stress. Well, let's decide whether we are talking about load testing or stress testing!
Another thing that I do not agree with is the general idea (in this article) that it is better to eliminate simple test scenarios.
I consider this approach not practical. And the article is misleading here.
From my seat, this article seems to promote only a complex of concerted scenarios that is supposed to "imitate" life. A demiurgic task!
I consider this way of testing very useful indeed, but not enough: I wanna know the maximum number of users that can be logged in at once. Or I wanna know the maximum number of products that can be in a shopping cart. Or I wanna know how many users can hit the login page in 1 second without errors. I could go on like that for minutes, but what I want to convey is this: simple scenarios are important in defining performance. They give the tester the possibility to "divide and conquer" and give the developer the possibility to evaluate improvements to specific functional tasks.
Your complex of scenarios gives none of these.
Think of a new model of car: how is it tested for load? Is it driven hard for 1 hour on different roads and pavements, with different driving styles, weather conditions, traffic conditions, etc.? Of course that is done, but besides that there are a lot more tests of load (and eventually performance). I wrote "car" because that industry is very mature, competitive, and dynamic, and has a solid engineering foundation (which the software industry lacks).
In short: I consider the article useful because it emphasizes one side of the truth. I consider the article poisonous for the same reason.
Read it and be aware of its shortcomings.