This paper discusses the fundamental approaches to establishing a test strategy and the various levels of test plans.
Test planning is one of the keys to successful software testing, yet it is frequently omitted because of time constraints, lack of training, or cultural bias. Indeed, a survey taken at a recent STAR testing conference showed that only 81% of the companies participating in the survey completed test plans (and our experience has shown that many of those 81% are calling the testing schedule a plan). Testing without a plan is analogous to developing software without a project plan and generally occurs for the same reason: pressure to begin coding (or in this case testing) as soon as possible (i.e., "Why isn't Sam coding yet?"). Many organizations measure progress in development by lines of code delivered and in testing by the number of test cases run. While these can be valuable measures, they don't recognize planning as a worthwhile activity.
Test planning can and should occur at several levels (For our European readers, the word level is used rather than stage). The first plan to consider is the Master Test Plan. The purpose of the Master Test Plan is to orchestrate testing at all levels (unit, integration, system, acceptance, beta, etc.). The Master Test Plan is to testing what the Project Plan is to the entire development effort. In fact, the Master Test Plan can be a separate document or could be considered part of the project plan. Test managers should think of the Master Test Plan as their major communication and control device. Test planning is a process (that ultimately leads to a document) that allows all parties involved in the testing process to proactively decide what the important issues are in testing and how to best deal with them. The goal of test planning is not to create a long list of test cases, but rather to deal with the important issues of testing strategy, resource utilization, responsibilities, risks, and priorities. In fact, test cases don't even belong in the test plan, but rather should be placed in test design specifications.
In test planning, the process is ultimately more important than the product. Discussing issues of what and how to test early in the project lifecycle can save a lot of time, money, and disagreement later. I once had a consulting assignment at a major American company where I was supposed to help them create their first-ever Master Test Plan. Following up with the client a few months later, the project manager told me that the creation of the Master Test Plan had contributed significantly to the success of the project, but unfortunately they hadn't really followed the plan or kept it up to date. "Let me get this straight," I responded. "You didn't use the plan, but you felt that it was a major contributor to your success…can you explain that?" The project manager told me that when they began to fall behind, they began to dispense with much of the project documentation, including the test plan (sound familiar?). But because they had created the plan early in the project lifecycle, many testing issues were raised that normally would not have been considered until it was too late to act on them. The planning process had also heightened the awareness of the importance of testing among all of the project participants. Now, I do believe that keeping test plans up to date is important, so that is not why I'm telling you this story. Rather, I'm trying to stress the importance of the test planning process, not just the document.
In addition to the Master Test Plan, it may be necessary to create detailed or level-specific test plans. On a larger project, it is often worthwhile to create an Acceptance Test Plan, System Test Plan, Integration Test Plan, Unit Test Plan, etc. Smaller projects (e.g., projects with smaller scope or fewer participants or organizations) may need only one test plan that covers all levels of test. Deciding the number and scope of the test plans required should be one of the first strategy decisions made in test planning. It is beyond the scope of this article to discuss all of the various issues at each level of test planning; for the most part the major considerations are the same, but they differ in scope. In fact, it is normally desirable to use the same basic template for the detailed test plans that you use for the Master Test Plan. Generally there is only one Unit Test Plan, or perhaps one Unit Test Plan per major feature or subset, but don't try to write a Unit Test Plan for each unit! (And remember, test plans do not contain test cases.)
Who is my audience?
Obviously, the first question you must ask yourself when creating a test plan is "Who is my audience?" The audience for a Unit Test Plan is quite different from the audience for an Acceptance Test Plan or a Master Test Plan—so the wording and the use of acronyms, technical terms, and jargon should be adjusted accordingly. Also keep in mind that different audiences have different tolerances for what they will and will not read. Executives may not be willing to read an entire Master Test Plan if it is fifty pages long, so you may have to consider an executive summary. Come to think of it, you might want to avoid making the plan prohibitively long or no one will read (and use) it. If your plan is too long, it may be necessary to break it into several plans of reduced scope (organized, for example, around subsystems or functionality). Sometimes the size of plans can be kept in check by the judicious use of references. But please proceed carefully—most people don't really want to gather a stack of documents just so they can read a single plan.
When should you begin the various test plans?
Generally it is desirable to begin the Master Test Plan at the same time the Requirements Documents and the Project Plan are being developed. Test planning can—and should—have a significant impact on the content of the Project Plan. Acceptance Test Plans can be started as soon as the requirements definition process has begun. I have one client who actually includes the Acceptance Test Plan as part of the Requirements Specification! Similarly, the System, Integration, and Unit Test Plans should be started as early as possible. Often, test planners get frustrated when they begin their planning process early and find that all of the information is not yet available or is in a state of flux. Experienced test planners have learned to use the notation "TBD" (to be determined) when they come to a part of the plan that is not yet known. This is important—because it allows the planners to see where to focus their efforts, and because it highlights what is yet to be done. It is true that plans written early will probably have to be changed during the course of the software development and testing. That process is also important, because it records the progress of the testing and helps planners become more proficient on future projects.
It is important that each organization have a standard template for its test plans. There are several available, including the U.S. military's 2167A or 498 and the IEEE/ANSI standard for software test documentation. I believe that any of these is a good place to start for your own customized template. You may find that the IEEE (or some other test plan template) meets your needs exactly as it is (i.e., "off the shelf"). But please feel free to customize the template you begin with. It is likely that over time you will find that some of the required items on your template are always left blank or are always the same. If they're always blank, why bother with those fields? And if the wording in certain sections is constant from plan to plan, then you must first decide if you've really addressed the issue. If you're confident that you have, then maybe that item should become part of your standard methodology and be removed from the plan. Remember, a plan considers the unique situation of a given project or release; it is not itself a standard that works for every project.
In the sections below, I am going to explain many of the issues that I believe should be considered in completing a test plan. The template that I'm going to describe is a slightly modified version of the IEEE/ANSI standard for test planning. This template is useful for creating any kind of test plan: Master, Acceptance, System, Integration, Unit, or whatever you call the levels in your organization.
1.0 Test Plan Identifier
In order to keep track of the most current version of your test plan, you will want to assign it an identifying number. If you have a standard documentation control system in your organization, then assigning numbers is second nature to you. (Hint: when I’m auditing the testing practices of an organization, I always check for the test plan identifier. If there isn't one, that usually means that the plan was created but never changed—and probably never used.)
2.0 References
This could include the requirements document, project plan, testing standards, QA plan, configuration management plan, design documents, company standards, etc.
3.0 Glossary
Used to define any terms and acronyms in the document that might not be understood.
4.0 Introduction
Usually I look for two main things in the introduction: (1) a basic description of the project or release including key features, history, etc., and (2) an introduction to the plan that describes the scope of the plan.
5.0 Test Items
This section describes what is to be tested from the viewpoint of the developer or configuration manager—that is, in terms of programs, modules, and versions rather than features. If this is a Master Test Plan, this section might talk in very broad terms: "version 2.2 of the accounting software," "version 1.2 of the users manual," or "version 4.5 of the requirements spec." If this is an Integration or Unit Test Plan, this section might actually list the programs to be tested, if known. This section should usually be completed in collaboration with the configuration or library manager.
6.0 Software Risk Issues
The purpose of discussing software risk is to determine what the primary focus of testing should be. Generally speaking, most organizations find that their resources are inadequate to test everything in a given release. Outlining software risks helps the testers prioritize what to test, and allows them to concentrate on those areas that are likely to fail or that would have a large impact on the customer if they did fail. Organizations that work on safety-critical software can usually use the information from their safety and hazard analyses here. However, I have found that in most other companies no attempt is made to articulate software risks in any fashion. If your company does not currently do any type of risk analysis, I recommend that you start simply. Try a brainstorming session among a small group of users, developers, and testers to find out what their concerns are. I always start the session with the words "What worries you..." because I find that the word risk can be intimidating to some people. Some examples of software risks include: interfaces to other systems, features that handle large sums of money, features that affect many (or a few very important) customers, highly complex software, modules with a history of defects (from a defect analysis), modules with many or complicated changes, security, performance and reliability issues, and features that are difficult to change or test. You can see that the risk analysis team needs users to judge the impact of failure on their work, as well as developers and testers to analyze the technical risk. The list of software risks should have a direct effect on what you test and in what order you test. Risk analysis is hard, especially the first time you try it—but you will get better, and it's worth the effort. Often, what you test is a lot more important than how much you test. (See Section 7.0, Features to Be Tested, for some more hints on basic risk analysis.)
7.0 Features to Be Tested
This is a listing of what will be tested from the user or customer point of view (as opposed to test items, which describe what to test from the viewpoint of the developer or library manager). For example, if you were testing an Automated Teller Machine (ATM), features to be tested might include Password Validation, Withdraw Money, Deposit, Transfer Funds, Audit Reports, and so forth. The features to be tested might be much more detailed for lower levels of test.
This section can also be used to help focus on software risks (see Section 6.0). One type of risk analysis that I have used and recommended to some of my clients involves analyzing the relative risk of each feature. Basically it involves assigning a value to each feature for likelihood of failure, as well as a value for impact of failure. You may want to use High, Medium and Low as a way of rating the likelihood and impact of failure of each feature. It will become very obvious that the features that have HH (denoting high likelihood and high impact) will have the greatest software risk, while those with a rating of LL will have the lowest. Somewhere in the middle we have ratings for MM, HL, LH, HM, etc.
You could argue that this method is subjective, and you might be right (actually, the ratings might be objective if they are based on the number of customers, dollars, software complexity, etc.). It is not necessary, however, to rank every feature from 1 (the highest risk) to X (the lowest risk). What is important is to break the features into a few risk categories (e.g., High, Medium, and Low). This allows us to focus our testing on the High Risk features first, then the Medium Risk features, and leave the Low Risk features to last. A finer measure is not really required for most projects. An added benefit of assigning a risk rating to each feature is that it can help determine which low risk features to move to Section 8.0, Features Not to Be Tested, when your project begins to fall behind schedule.
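To make the idea concrete, here is a minimal sketch (in Python) of how likelihood and impact ratings might be recorded and used to sort features into test order. The feature names and ratings are hypothetical ATM examples, not part of any real plan; the numeric weights are simply one way of combining the two ratings.

    # A minimal sketch of feature risk ranking, assuming a High/Medium/Low
    # scale for both likelihood and impact. Features and ratings are
    # hypothetical examples.

    RATING = {"H": 3, "M": 2, "L": 1}

    # (feature, likelihood of failure, impact of failure)
    features = [
        ("Password Validation", "M", "H"),
        ("Withdraw Money",      "H", "H"),
        ("Deposit",             "M", "M"),
        ("Transfer Funds",      "H", "M"),
        ("Audit Reports",       "L", "L"),
    ]

    def risk_score(likelihood: str, impact: str) -> int:
        """Combine the two ratings into a single sortable score."""
        return RATING[likelihood] * RATING[impact]

    # Sort so the HH features come first and the LL features last.
    for name, likelihood, impact in sorted(
            features, key=lambda f: risk_score(f[1], f[2]), reverse=True):
        print(f"{name:20s} {likelihood}{impact}  score={risk_score(likelihood, impact)}")

Running a sketch like this simply makes the HH/MM/LL grouping visible; the point is the grouping itself, not the arithmetic.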
8.0 Features Not to Be Tested
This section is used to record any features that will not be tested and why. There are many reasons that a particular feature might not be tested (e.g., it wasn't changed, it is not yet available for use, it has a good track record, etc.); but whatever the reason a feature is listed in this section, it all boils down to relatively low risk. Even features that are to be shipped but not yet "turned on" and available for use pose at least some degree of risk, especially if no testing is done on them. This section will certainly raise a few eyebrows among managers and users (many of whom cannot imagine consciously deciding not to test a feature), so be careful to document the reason you decided not to test a particular feature. These same managers and users, however, will often approve a schedule that cannot possibly allow enough time to test everything. This section is about intelligently choosing what not to test (i.e., low risk features), rather than just running out of time and not testing whatever is left on the ship date.
Some companies that make safety-critical systems or have a corporate culture that “requires” every feature to be tested will have a hard time politically listing any features in this section. If every feature is really tested, fine—but if resources do not allow that degree of effort, using the Features Not to Be Tested Section actually helps to reduce risk.
One other item to note is that this section may grow if the project falls behind schedule. If the risk assessment (see Sections 6 and 7) rates each feature by risk (for example, H, M, and L), it is much easier to decide which additional features pose the least risk if moved from Section 7 to Section 8. Of course, there are other options besides reducing testing when a project falls behind schedule, and these will be discussed in Section 18, Planning Risks and Contingencies.
9.0 Approach
Since this section is the heart of the test plan, some of my clients choose to label it Strategy rather than Approach. The Approach should contain a description of how testing will be done (approach) and discuss any issues that have a major impact on the success of testing and ultimately of the project (strategy). For a Master Test Plan, the approach to be taken for each level should be discussed including the entrance and exit criteria from one level to another.
For example: System Testing will take place in the Test Labs in our London Office. The testing effort will be under the direction of the London VV&T team, with support from the Development staff and users from our New York office. An extract of production data from an entire month will be used for the entire testing effort. Test Plans, Test Design Specs, and Test Case Specs will be developed using the IEEE/ANSI guidelines. All tests will be captured using SQA Robot for subsequent regression testing. Tests will be designed and run to test all features listed in Section 7 of the System Test Plan. Additionally, testing will be done in concert with our Paris office to test the billing interface. Performance, Security, Load, Reliability, and Usability Testing will be included as part of the System Test. Performance Testing will begin as soon as the system has achieved stability. All user documentation will be tested in the latter part of the System Test. The System Test team will assist the Acceptance Test team in testing the installation procedures. Before bug fixes are reintroduced into the test system, they must first successfully pass Unit Test and, if necessary, Integration Test. Weekly status meetings will be held to discuss any issues and revisions to the System Test Plan as required.
Exit Criteria from System Test:
- All test cases must be documented and run.
- 90% of all test cases must pass.
- All test cases dealing with the Billing Function must pass.
- There can be no unfixed Class 2 or 3 defects.
- Code coverage must equal at least 80% (including integration and unit testing).
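To illustrate, exit criteria like the ones above can even be checked mechanically at the end of each test cycle. The following is a rough sketch in Python; the metric values are made-up numbers and the field names are my own assumptions, standing in for whatever the test management and coverage tools actually report.

    # A rough sketch of evaluating the example System Test exit criteria.
    # The metrics below are illustrative numbers only.

    metrics = {
        "cases_documented_and_run": True,   # all test cases documented and run
        "pass_rate": 0.93,                  # fraction of test cases passing
        "billing_cases_failed": 0,          # failed cases in the Billing Function
        "open_class_2_or_3_defects": 0,     # unfixed Class 2 or 3 defects
        "code_coverage": 0.82,              # combined unit/integration coverage
    }

    criteria = [
        ("All test cases documented and run", metrics["cases_documented_and_run"]),
        ("At least 90% of test cases pass",   metrics["pass_rate"] >= 0.90),
        ("All Billing Function cases pass",   metrics["billing_cases_failed"] == 0),
        ("No unfixed Class 2 or 3 defects",   metrics["open_class_2_or_3_defects"] == 0),
        ("Code coverage at least 80%",        metrics["code_coverage"] >= 0.80),
    ]

    for description, met in criteria:
        print(f"{'PASS' if met else 'FAIL'}  {description}")

    print("Exit criteria met." if all(met for _, met in criteria)
          else "Exit criteria NOT met.")

The value of writing the criteria this explicitly is that "good enough to exit System Test" stops being a matter of opinion at the end of the project.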
Another topic that should generally be discussed in the Approach is how configuration management will be handled during test (however it is possible that this could be handled in a document of its own in some companies). Configuration management in this context includes change management as well as the decision-making process used to prioritize bugs. Change management is important because it is critical to keep track of the version of the software and related documents that are being tested. There have been many woeful tales of companies that have actually shipped the wrong (untested) version of the software. Equally important is the process for reviewing, prioritizing, fixing, and retesting of bugs. The test environment in some companies is controlled by the developers, which can be very problematic for test groups. As a rule, programmers want to fix every bug immediately. It's as though the programmers feel that if they can fix the bug quickly enough it didn't happen! Testers on the other hand, are famous for saying that "testing a spec is like walking on water; it helps if it's frozen." Obviously both of the extremes are counterproductive. If every bug fix is re-implemented immediately, the testers would never do anything but Regression Testing. Conversely, if the code is frozen prematurely, eventually the tests will become unrealistic. The key is to agree on a process for reviewing, fixing and implementing bugs back into the test environment. This process may be very informal during Unit and Integration Test, but will probably need to be much more rigid at higher levels of test. Indeed, I occasionally recommend that my clients use Acceptance Tests as a way of validating the software configuration management process. A formal or informal change control board can be set up to handle this chore. Typically, the board should include members from the user community, developers, and testers. They will determine the severity of the bug, approximate cost to fix and test, and ultimately the priority for fixing and re-implementing.
Another topic often described in the approach is metrics. Since metrics collection and validation can be a significant overhead, it is necessary to discuss what metrics will be collected, what they will be used for, and how they will be validated. Imposing a cumbersome metrics program on a development team can cause resentment and lead to undesirable side-effects, such as unreliable measurements. It is not my purpose in this article to discuss the pros and cons of various types of metrics and programs, but to merely point out that they should be discussed in the approach of the Master Test Plan.
Another strategy issue that should probably be addressed in the test plan is the use of tools and automation. Testing tools can be a boon to the development and testing staff, but they can also spell disaster if their use is not planned. Using some types of tools can actually require more time to develop, implement, and run a test set the first time than it would take to run the tests manually. Using tools, however, may save time during Regression Testing—and other types of tools can pay time dividends from the very beginning. Again, it is not my desire to discuss test tools here, only to point out that their use needs to be planned (including training!).
Other topics that might show up in the approach include: how to handle multiple hardware configurations, multi-level security, beta testing, how to set up the test environment and how often to update it, use of contractual support, etc. The bottom line is this: Anything that has a significant impact on the effectiveness or cost of testing is a candidate for the approach section.
10.0 Item Pass/Fail Criteria
This section describes the pass/fail criteria for each of the items described in Section 5. Just as every test case needs an expected result, each test item needs to have an expected result. Typically pass/fail criteria are expressed in terms of test cases passed/failed, number, type, severity and location of bugs, usability, reliability, and/or stability. The exact criteria used will vary from level to level and company to company. If you’ve never tried to do this before, you may find it a little frustrating the first time or two; but trying to specify "what is good enough" in advance can really help crystallize the thinking of the various test planners and reduce contention later. If the software developer is a contractor, this section can even have legal ramifications.
11.0 Suspension Criteria and Resumption Criteria
The purpose of this section is to identify any conditions that warrant a temporary suspension of testing. Because test execution is often so harried, testers have a tendency to surge forward no matter what happens. Unfortunately, this can often lead to additional work and a great deal of frustration. For example, if a group is testing some kind of communications network or switch, there may come a time when it is no longer useful to continue testing a particular interface if the protocol to be used is undefined or in flux. Sometimes metrics are established to flag a condition that warrants suspending testing. For example, if a certain predefined number of total defects, or defects of a certain severity, are encountered, testing may be halted—until a determination can be made whether to redesign part of the system, try an alternate approach, etc. Sometimes suspension criteria are displayed in the form of a Gantt chart.
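As an illustration, a metric-based suspension criterion can be as simple as the sketch below (Python). The thresholds are hypothetical; each team would agree on its own numbers in the plan.

    # A simple sketch of metric-based suspension criteria. The thresholds are
    # hypothetical examples of what a plan might specify.

    SUSPEND_IF = {
        "total_open_defects": 50,    # suspend if more than 50 defects are open
        "open_severity_1": 1,        # suspend if more than 1 severity-1 defect is open
    }

    def should_suspend(total_open_defects: int, open_severity_1: int) -> bool:
        """Return True if any agreed suspension threshold has been exceeded."""
        return (total_open_defects > SUSPEND_IF["total_open_defects"]
                or open_severity_1 > SUSPEND_IF["open_severity_1"])

    # Example: 12 open defects, 2 of them severity 1 -> suspend testing
    print(should_suspend(total_open_defects=12, open_severity_1=2))  # True

The resumption criteria are then simply the conditions under which the same check no longer trips.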
12.0 Test Deliverables
This is a listing of all of the documents, tools, and other elements that are to be developed and maintained in support of the testing effort. Examples include: Test Plans, Test Design Specs, Test Cases, Custom Tools, Defect Reports, Test Summary Reports, and Simulators. One thing that is not a test deliverable is the software to be tested; remember that is listed under Test Items.
13.0 Remaining Testing Tasks
This section is called Testing Tasks in the IEEE standard. Since I usually put the testing tasks in a matrix under the Responsibilities section, I renamed the field Remaining Testing Tasks and use it in one of the following two ways:
- When a project is a multiphase project, it is useful to try to briefly describe what will be tested in the future releases.
- When a third party is used to develop the software (outsourcing) this section can be used to describe the extent of their testing (often unit and/or integration testing).
14.0 Environmental Needs
Environmental needs include:
Hardware configuration. An attempt should be made to make the platform as similar to the real world system as possible. If the system is destined to be run on multiple platforms, a decision must be made whether to replicate all of these configurations, or to replicate only targeted configurations (e.g., the riskiest, the most common, etc.). When you're determining the hardware configuration, don't forget the system software as well.
Data. Again, it is necessary to identify where the data will come from to populate the test database. Choices might include production data, purchased data, user-supplied data, generated data, and simulators. It will also be necessary to determine how to validate the data—it is false to assume that even production data is totally accurate. We must also assess the fragility of the data (so we know how often to update it!).
Interfaces. My students no doubt get tired of hearing me say "interfaces are risky"; but indeed they are. When planning the test environment, it is very important to determine and define all interfaces. Occasionally the systems that we must interface with already exist; in other instances, they may not yet be ready and all we have to work with is a design spec or some type of protocol. If the interface is not already in existence, building a realistic simulator may be part of our testing job.
Facilities, Publications, Security Access, etc. This may seem trivial, but we must ensure that we have somewhere to test, appropriate security clearance, and so forth.
15.0 Staffing and Training Needs
The actual number of staff required is of course dependent on the scope of the project, schedule etc. What we want to do in this section is describe the number of people required and what skills they need. You may merely want to say that you need fifteen journeymen testers and five apprentice testers. Often, however, you will have to be more specific. It is certainly acceptable to state that you need a special person: "We must have Jane Smith to help establish a realistic test environment."
Examples of training needs might include learning about: how to use a tool, testing methodologies, interfacing systems, management systems such as defect tracking, configuration management, basic business knowledge (related to the system under test), etc.
16.0 Responsibilities
I like to include a matrix in this section that quickly shows major responsibilities such as establishment of the test environment, configuration management, unit testing and so forth. I believe that it is best to specify the responsible parties by name or by organization.
17.0 Schedule
The schedule should be built around the milestones contained in the Project Plan, such as delivery dates of various documents and modules, availability of resources, and interfaces. Then it will be necessary to add all of the testing milestones. These testing milestones will differ in level of detail depending upon the level of the test plan being created. In a Master Test Plan, milestones will be built around major events such as requirements and design reviews, code delivery, completion of the users manuals, and availability of interfaces. In a Unit Test Plan, most of the milestones will be based on the completion of various Programming Specs and Units.
Initially it may be necessary to build a generic schedule without calendar dates; this will identify the time required for various tasks and the dependencies among them without specifying particular start and finish dates. Normally the schedule will be portrayed graphically using a Gantt chart to show dependencies.
It is beyond the scope of this article to discuss estimating in any great detail. But if we are ever going to gain credibility in the software development arena, we must get better at estimating time and resources. It is important that the schedule reflect how the estimates for the milestones were determined. For example, when the time schedule is very aggressive, estimating becomes even more critical—so that the planning risks and contingencies and test priorities can be specified. Recording schedules based on estimates also provides the test manager with an audit trail of how the estimates did and did not come to pass, and forms the basis for better estimating in the future.
18.0 Planning Risks and Contingencies
Many organizations have made a big show of announcing their commitment to quality. We’ve had quality circles, quality management, total quality management, and who knows what else. Unfortunately, in the software world many of these same organizations have demonstrated that their only true commitment is to the schedule.
Most of us have taken part in projects where the schedule is at best ambitious and at worst impossible. Once an implementation date is set, it is often considered sacred. Customers may have been promised a product on a certain date; management credibility is on the line; corporate reputation is at stake; or our competitors may be breathing down our neck. At the same time, as an organization we may have stretched our resources to the limit. It is not my purpose to address the many reasons why we so often find ourselves in this unenviable spot. Rather, I would like to talk about what we can do about it. When a project falls behind, there are really only a few choices: reduce the scope of the release, delay implementation, add resources, or reduce quality processes (such as testing).
Unfortunately, all of the above choices seem bad—and all too often management decides that they are all unacceptable. If management does not make proactive decisions in situations like this, the technical staff will often end up making the choices by default. Initially, more resources will be added, typically in the form of overtime. If this does not solve the problem, the team will begin to take shortcuts—eliminating a document here, a review there, or eliminating an entire set of tests. Not surprisingly, the quality suffers. If the project is still in jeopardy, functionality that is not absolutely essential will be rescheduled for a later release, or the date may be slipped. Eventually the new target date may be met, and a watered-down system of poor quality is delivered to the customer late by a very frustrated development team. Sound familiar?
The purpose of this section is to help us make intelligent, informed decisions. Almost every project team can identify the planning risks that worry them: late requirements, test environment problems, late delivery of software, etc. Our goal is to decide in advance what to do if one of these planning risks comes true. Our choices were outlined above: reduce the scope, delay implementation, add resources, or reduce quality processes. If the choice is made to reduce quality processes (such as testing), it becomes critical that we know what testing to reduce. This plan helped to identify the software risks (see Section 6), and those risks should have helped us focus and prioritize our testing to reduce them. Naturally, we don't want to learn at the last moment that we have to reduce testing…and then learn that it is the critical components of the system that have not yet been tested! I also mentioned that we can use risk to prioritize the Features to Be Tested (see Section 7) and to help decide what will go into Features Not to Be Tested (see Section 8). It should be apparent at this point that the Planning Risks, the Software Risks, the Features to Be Tested, the Features Not to Be Tested—indeed the whole plan—are built around this concept of using risk to prioritize the entire testing effort.
19.0 Approvals
I believe that the approvers should be the person or persons who can say that the software is ready to move to the next stage. For example, the approver on a Unit Test Plan might be the development manager. The approvers on a System Test Plan might be the person in charge of the System Test and whoever is going to receive the product next (which may be the customer, if they are going to be doing the Acceptance Testing). If this is a Master Test Plan, there may be many approvers: developers, testers, customers, QA, configuration management, etc.
What we really want to avoid, however, is the situation in which we seek the appropriate signatures after the plan has been completed. If you do get the various parties to sign at that time, all you have is their autograph—which is fine if they ever become famous and you're an autograph collector. Our goal, instead, is to get agreement and commitment, which means that the approvers should have been involved in the creation and/or review of the plan all during its development. It is part of your challenge as the test planner to determine how to involve all of the approvers in the test planning process.
Test planning is a lot of work and can be time-consuming. If you're under the gun to get the next release out the door, you may argue that you can't afford to spend the time creating a test plan. I hope that you will agree that you can't afford to begin testing without a good test plan.