How to compute % defects removed from release candidate code

Recently someone on asked me to explain how to compute the defect removal rate for release candidate software.  There are two methods for producing this number and I teach both in several of my seminars, but I’ll explain the simpler method in this post…

Lawrence Putnam presented this model in his 1992 Book titled Measures for Excellence.  His book reads more like a math text than a software development guide, and suffers from an unfortunate formula typo which has lead to widespread confusion about his models in the industry, but I will  explain his defect removal rate calculation process.  (I hired a math wizard to examine his data and correct the formula!)

1. For a typical project, code is produced at a rate which resembles a Rayleigh curve.  A Rayleigh curve looks like a bell curve with a long-tail.  See my ASCII graphics below:


2. Error ‘creation’ typically happens in parallel and proportional to code creation.  So, you can think of errors created (or injected) into code as a smaller Rayleigh curve:


where ‘|’ represents code, and ‘+’ represents errors

3. Therefore, as defects are found, their ‘detection rate’ will also follow a Rayleigh curve.  At some point your defect discovery rate will peak and then start to lesson.  This peak, or apex, is about 40% of the volume of a Rayleigh curve.

4. So, when your defect rate peaks and starts to diminish, factor the peak as 40% of all defects found, then use regression analysis to calculate how many defects are still in the code and not found yet.

By regression analysis I mean if you found 37 defects at the apex after three weeks of testing, you know two things:  37 = 40% of defects in code, so code contains ~ (37 * 100/40) = ~ 93 errors total, and your finding about 10.2 defects per week, so total testing time will be about 9 weeks.

Of course, this assumes complete code coverage and a constant rate of testing.

Hope this is clear.

Mike J. Berry