Testing is such an integral part of our software projects that we often don't stop to think about why we do it. We must do it. What else is there to know? It is obvious that software that has not been tested is not ready for deployment. Yet, as painful experience has taught us, testing does not guarantee that the software is fit to deploy. Even rigorously tested software may still have hidden fatal flaws.
If zero defects is an unreasonable goal of testing activities, what should we be trying to achieve? Since we cannot eliminate the risk of serious defects in our deployed products, we must manage that risk. Testing is an important tool for managing and mitigating the risk of undetected defects in our products and is, at its heart, a risk management activity.
There Will Always Be Undetected Defects
We have all seen it: the system was developed and deployed years ago. It has been used daily for those intervening years, but then one day, it fails. Of course, you suspect the most recent change that was made, but there are times when the problem has been there all along. Somehow, all of the testing and use to date had not uncovered its presence. How could this happen?
Our software systems have always been complex, and their complexity continues to grow. This complexity means that there are millions (or billions) of paths through the code, combined with the potential combinations of inputs and conditions. Not only does this mean that we cannot test every path and combination, it also highlights the futility of believing that testing is primarily about removing defects.
Managing Risk: a Primer
Risk has two dimensions: probability and impact. The probability of a risk refers to how likely it is that the risk will be realized. The impact of a risk refers to the degree to which that risk will affect the concerned parties if it is realized (regardless of how likely that is).
Managing risk involves understanding those two dimensions and taking action based upon them. Those actions can include any of these:
· Risk avoidance: Some software defects can be avoided by improved software design and development methods, but that is a different topic for a different article. We cannot avoid having at least some undetected software defects, however.
· Risk transference: Our customers and users don't like the idea of bearing the risk of software defects themselves, but I can't think of anyone else to whom we could transfer this risk.
· Contingency planning: We plan for supporting our software products to deal with the defects when they finally come to light, but can't we do better than that?
· Risk mitigation: We can take explicit action to reduce the probability or impact of undetected software defects.
Testing as a Risk Mitigation Strategy
The very act of testing reduces the risk of undetected defects. Because there is a finite (though potentially large) number of defects in any product, every defect we find and fix leaves fewer undetected. Of course, the more testing we do, and the more kinds of testing we do, the more significant this risk reduction becomes.
We can do better than merely reducing risk by brute force, though. We can also employ Risk-Based Testing to focus our testing efforts in ways that have the greatest mitigating effects.
Though we cannot guarantee that no undetected defects remain, we can make it less likely that people will encounter them. For example, we can focus our testing on the things that end users do most often, eliminating many of the defects that are in users' normal path. We can focus testing on the platforms that most of them use, reducing the likelihood that they will experience platform-specific defects. Also, we can be sure to test the most common mistakes that users make, reducing the probability that they will experience error-handling defects.
The other side of risk mitigation is reducing the impact of the undetected defects. This involves prioritizing the software functionality according to the importance of the business processes that each enables, and focusing on the functions for which failure would have the greatest impact. For example, if regulatory compliance requires that certain end-of-quarter processing be correct, we would focus significant testing on that functionality (even though it will only be used four times per year).
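The two sides of risk-based testing described above can be combined into a simple ordering of test areas. The following Python sketch is only an illustration: the FeatureArea type, the 1-to-5 impact scale, the usage estimates, and the rule that critical-impact areas always go first are assumptions made for this example, not an established method.

```python
from dataclasses import dataclass

CRITICAL_IMPACT = 5  # assumed top of a hypothetical 1..5 impact scale


@dataclass(frozen=True)
class FeatureArea:
    name: str
    usage_likelihood: float  # assumed estimate, 0..1, of how often users hit this area
    failure_impact: int      # assumed business impact of a failure, 1 (minor) to 5 (severe)

    @property
    def exposure(self) -> float:
        # Likelihood dimension multiplied by impact dimension.
        return self.usage_likelihood * self.failure_impact


def prioritize(areas):
    """Order areas for testing: critical-impact areas first, then by exposure."""
    return sorted(
        areas,
        key=lambda a: (a.failure_impact >= CRITICAL_IMPACT, a.exposure),
        reverse=True,
    )


# Made-up figures for illustration only.
areas = [
    FeatureArea("login", usage_likelihood=0.95, failure_impact=3),
    FeatureArea("report export", usage_likelihood=0.30, failure_impact=2),
    FeatureArea("end-of-quarter processing", usage_likelihood=0.05, failure_impact=5),
]

for area in prioritize(areas):
    print(f"{area.name}: exposure={area.exposure:.2f}")
```

With these made-up numbers, end-of-quarter processing is tested first despite its low usage, followed by the areas that users touch most often. A plain likelihood-times-impact ranking would have placed it last, which is why the sketch treats critical impact as an override.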
Testing to Quantify Risk
Although risk-based testing is a powerful way to mitigate risk, it does not constitute the most significant way in which testing helps us with risk. Any testing (but especially risk-based testing) gives us insights into the risks associated with the product.
Consider the situation from the position of your management and/or customer. Since delivering zero defects is not possible, what is the best thing we can give them? In most cases, the most important thing we can do is to give them a strong basis for deciding if the product is really ready for deployment. The best way we can do this is by quantifying the risks.
Quantifying the risks associated with deploying a particular product starts with understanding what is important to your key stakeholders. Do they care about support costs or business processes? What about position in the marketplace or customers' opinions? With this insight in hand, we can quantify the risks to these things based on what we have discovered during testing.
For example, if comprehensive testing of the functionality around a key business process has uncovered few problems, we can express the probability of defects in that high-impact functionality as relatively low, increasing stakeholders' confidence in it. If that testing has uncovered a significant number of defects, though, we might then express the likelihood of more defects as moderate (or even high), reducing their confidence in it (based on the principle that testing uncovers only a percentage of existing defects).
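The principle that testing uncovers only a percentage of existing defects can be turned into rough arithmetic. The sketch below assumes a single, fixed detection rate, which is a simplification; the function name and the 80% figure are hypothetical, and a real rate would have to come from an organization's own history.

```python
def estimate_remaining_defects(found: int, detection_rate: float) -> float:
    """Estimate defects still hidden, assuming testing finds a fixed share of them.

    detection_rate is a hypothetical figure (0 < rate <= 1), for example
    derived from past counts of defects found before versus after release.
    """
    if found < 0:
        raise ValueError("found must be non-negative")
    if not 0 < detection_rate <= 1:
        raise ValueError("detection_rate must be in (0, 1]")
    estimated_total = found / detection_rate
    return estimated_total - found


# Same assumed detection rate, different numbers of defects found in testing.
for found in (4, 40):
    remaining = estimate_remaining_defects(found, detection_rate=0.8)
    print(f"found {found:>2} -> roughly {remaining:.0f} still undetected")
```

At an assumed 80% detection rate, finding 4 defects suggests that about 1 more remains, while finding 40 suggests about 10. The absolute numbers are only as good as the assumed rate, but the comparison shows why a high defect count in testing lowers confidence rather than raising it.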
If time has run out, then the tests that we have not yet run successfully represent areas of risk in the product, reducing confidence. By the same token, uncorrected defects represent risk because they can mask hidden defects (those that can be detected only by additional testing after the blocking defects have been fixed). In fact, any limitation of the time or resources for testing can be expressed in terms of risk in the final product.
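One way to report this to stakeholders is to count, per functional area, the tests that have not yet run successfully. This is a minimal sketch; the status labels and area names are assumptions for the example, not the output of any particular test tool.

```python
from collections import Counter


def unverified_tests_by_area(results):
    """Count tests per area that have not yet passed.

    results: iterable of (area, status) pairs, where status is one of the
    assumed labels "passed", "failed", "blocked", or "not_run".
    """
    return Counter(area for area, status in results if status != "passed")


# Made-up results for illustration only.
results = [
    ("billing", "passed"),
    ("billing", "failed"),
    ("billing", "blocked"),
    ("search", "not_run"),
    ("login", "passed"),
]

# Areas with the most unverified tests come first.
for area, count in unverified_tests_by_area(results).most_common():
    print(f"{area}: {count} test(s) not yet run successfully")
```

Areas that do not appear in the summary have no outstanding tests; areas that do appear are where the remaining risk is concentrated.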
Our key stakeholders need good information with which to make deployment decisions. Key among their concerns is the risk of undetected defects. Helping them to understand these risks so they can make informed deployment choices is the most important purpose of software testing.