The Japanese manufacturing revolution—a combination of lean production, quality techniques, and sheer necessity—lifted the Japanese industrial phoenix out of the ashes of World War II until it eventually dominated world manufacturing. The concepts used to rebuild the empire can easily be applied to the software development industry. In this week's column, Clarke Ching describes why the majority of the software industry has done a poor job of incorporating these quality techniques and reveals the one secret ingredient that makes them easy to adopt.
The late W. Edwards Deming, an American statistician, is widely credited with providing much of the know-how and philosophy that drove the Japanese recovery. Ironically, his know-how was developed in America, where it was largely ignored. Deming was sent to Japan in 1947, but it wasn't until 1980, when NBC featured him in the television program "If Japan Can ... Why Can't We?", that the West finally took notice. Deming's message slowly spread from Japanese manufacturing into Western manufacturing, then into the Western service industries.
Deming's key message was to build quality into the process rather than inspect defects out of a product that has already been built. Although his concepts are applicable to software development, where quality is job number one, they have either not been applied at all or not been applied well. The majority of software development projects still follow the waterfall lifecycle, which relies on inspecting defects out, that is, testing the product to remove defects.
Late Inspection
Late inspection and testing are the simplest, most expensive, and least effective way to find bugs. Before manufacturers adopted modern quality techniques, their main approach to removing defects was to inspect each product in the last stage of production, once it had been built but before it was shipped. If the product didn't meet specifications, it was either reworked or scrapped, both expensive options. Late inspection is prone to human error and rarely finds all defects.
Late inspection is also the primary model in waterfall software development. Rework (which the waterfall is meant to minimize) is pushed to the phase of the project when fixes are most expensive, when the cost of change is 40 to 100 times greater than if the defect had been fixed when it was created (Boehm, 2004). This phase also leaves the least room in the schedule to recover from show-stopper problems or unexpectedly large amounts of rework. This unpredictability is one of the main reasons projects miss their schedules.
Early Inspection
Recognizing the high cost of late inspection, manufacturers moved inspection earlier in the process by using self-inspection and successive inspection. In self-inspection, operators inspect their own output immediately after processing it. In successive inspection, operators inspect the inputs they receive from prior processes. The equivalent processes in software development are walk-throughs, inspections, and unit testing. These approaches provide much earlier feedback, when correction costs are much lower, but they still rely on human judgment.
Mistake Proofing—Building Quality into the Process
A more powerful approach is to design your processes to automatically prevent defects from happening. The Japanese name for such techniques is Poka Yoke, meaning mistake proofing. Poka Yoke has two aspects: prevention and detection. For instance, having differently shaped plugs for each type of connection in a PC prevents novices and experts alike from making mistakes when installing a new PC, and the electric fuses and circuit breakers in your home detect when electrical circuits are overloaded, which prevents electrical fires.
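The two aspects of Poka Yoke can be sketched in code. The following is only an illustration of the plug and circuit-breaker analogy above, not a prescribed implementation; the names Connector, Port, and CircuitBreaker are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Connector(Enum):
    """Hypothetical plug shapes; each port accepts exactly one of them."""
    POWER = "power"
    VIDEO = "video"
    NETWORK = "network"


@dataclass(frozen=True)
class Port:
    shape: Connector

    def plug_in(self, cable: Connector) -> str:
        # Prevention: a cable of the wrong shape is refused at the moment of
        # connection, so the mistake never travels to a later stage.
        if cable is not self.shape:
            raise ValueError(
                f"{cable.name} cable does not fit {self.shape.name} port"
            )
        return f"{cable.name} connected"


class CircuitBreaker:
    """Detection: trips as soon as the total load exceeds the limit."""

    def __init__(self, limit_amps: float) -> None:
        self.limit_amps = limit_amps
        self.load_amps = 0.0
        self.tripped = False

    def add_load(self, amps: float) -> None:
        if self.tripped:
            raise RuntimeError("breaker is tripped; reset before adding load")
        self.load_amps += amps
        if self.load_amps > self.limit_amps:
            # The overload is detected when it happens, not at final inspection.
            self.tripped = True
            raise RuntimeError("overload detected; circuit opened")
```

In both cases the check sits inside the process itself, so nobody has to remember to inspect for the mistake afterwards.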
Likewise, one of the most common causes of defects, ambiguous requirements, can be prevented by writing comprehensive acceptance tests when each requirement is captured. Furthermore, automating these tests and running them as part of frequent integration builds helps detect defects when they happen.
The Missing Ingredient
Deming's core message was that we should stop inspecting defects out of products and start building quality in. Obviously, preventing defects or finding them when they are cheapest to fix is preferable to finding them all at the end, when they are many, many times more expensive to fix. Yet few software development projects write tests up front, perform inspections, or integrate frequently, despite the benefits.
Why not? Because it is hard work in most environments. It's hard work trying to inspect hundreds of pages of documents. It's hard work trying to write tests for many pages of requirements at the beginning of a project. It is even harder to keep the tests up to date as the requirements change. It's harder still when you realize that you have to inspect the tests.
So what was different in the manufacturing environment that allowed manufacturers to adopt the equivalent processes?
The primary difference is that when the Japanese manufacturers adopted Deming's quality practices and philosophy, they also adopted what are now known as lean manufacturing techniques. Lean factories have much less work-in-process than traditional factories, and this allows them to spot defects much sooner. For instance, compare a factory where it takes three months from when material is released until the product is shipped with a factory where it takes only two days. If a systemic defect occurs in the first process, it will take three months for the defect to be found in the first factory, but only two days in the second. At the point that the defect is found, every product within the factory will have the defect and will need to be scrapped or reworked. The cost of rework and scrap is significantly lower in the factory with lower work-in-process (WIP).
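The factory comparison can be worked through with simple arithmetic. The sketch below assumes a steady release rate, treats three months as roughly 90 days, and uses a hypothetical rate of 100 units per day; none of these figures come from the original example. Under those assumptions, the number of units in the factory is the release rate multiplied by the lead time.

```python
def units_exposed(lead_time_days: float, units_per_day: float) -> float:
    """Units in process when a first-step defect is finally found at shipping.

    Assumes material is released at a steady rate, so work-in-process
    equals the release rate multiplied by the lead time.
    """
    return lead_time_days * units_per_day


# Hypothetical figures: both factories release 100 units per day.
slow_factory = units_exposed(lead_time_days=90, units_per_day=100)
fast_factory = units_exposed(lead_time_days=2, units_per_day=100)

# How many times more rework or scrap the slow factory faces.
exposure_ratio = slow_factory / fast_factory
```

With these assumed numbers, the three-month factory has 9,000 defective units to scrap or rework and the two-day factory has 200, a 45-fold difference that depends only on the ratio of the lead times.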
Lean factories achieve much lower levels of WIP by working in smaller batches. The equivalent in software development is working in small increments, as the Agile approaches do. It is much easier to write tests up front and to automate them when you are working on a much smaller set of functionality at any point in time. When working this way, defects are discovered much sooner, when they are easier to fix; the quality of the application is more transparent; and the odds of delivering on time are much higher.
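As a rough illustration of how small an up-front acceptance test can be when the increment is small, consider a single hypothetical requirement: "orders of ten or more units receive a 5 percent discount." The requirement, the function name order_total_cents, and the thresholds are all invented for this sketch. The tests are written as plain functions with assertions so that a test runner such as pytest could collect them in an automated build.

```python
BULK_THRESHOLD = 10        # hypothetical: units needed to earn the discount
BULK_DISCOUNT_PERCENT = 5  # hypothetical discount rate


def order_total_cents(unit_price_cents: int, quantity: int) -> int:
    """Return the order total in cents, applying the bulk discount."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    total = unit_price_cents * quantity
    if quantity >= BULK_THRESHOLD:
        total -= total * BULK_DISCOUNT_PERCENT // 100
    return total


# Acceptance tests captured alongside the requirement. The boundary cases
# pin down what "ten or more" means, removing the ambiguity up front.
def test_nine_units_pay_full_price():
    assert order_total_cents(200, 9) == 1800


def test_ten_units_receive_discount():
    assert order_total_cents(200, 10) == 1900


def test_non_positive_quantity_is_rejected():
    try:
        order_total_cents(200, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for zero quantity")
```

Three short tests cover the whole increment, which is why keeping them current as the requirement changes is far less work than maintaining tests for many pages of requirements at once.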
References:
"Balancing Agility and Discipline: A Guide for the Perplexed" by Barry Boehm and Richard Turner.