Have you ever asked yourself what testing does in the big picture? Surely you have :) My answer would be: testing reduces the risk that users will run into issues with the software. In this respect it is very close to what other engineering specialists do. In the same way, they reduce the risk that a building collapses in a severe earthquake or that a car suddenly catches fire while you drive.
Have you ever wondered what techniques other engineers use? Well, I guess most of us haven't, in which case this article will be of help.
I have worked with software that implemented one of the most widely used strategies for risk reduction: Failure Mode and Effects Analysis (hereafter FMEA for the sake of conciseness).
The key in risk management is keeping the possible bad effects in sight. So, it's all about perception. You need to imagine what can go wrong with your application in the user's hands and build a strategy to prevent that particular risk from materializing. In car manufacturing, there can be a risk of losing a wheel at high speed. Engineers will think of special means to keep a wheel from coming off even if the nuts get loose. Having solved one such problem, engineers advance to the next one, starting from the most severe (severity in this case is a combination of probability and impact).
Now back to software testing. The risks that our users may face are caused by software defects. Of course, it is unrealistic to foresee the defects themselves. But we can foresee the consequences of malfunctions. Let's start from the very beginning. The very first thing a user does is try to set up the application. So, the worst thing that may happen is a setup failure that prevents further use of the software. We have found a failure mode, but this is not enough to start creating a prevention strategy, as this risk is too vague. The point is to describe the failure mode as precisely as possible.
The setup may fail in many different ways:
- Setup failure in the default setup conditions
- Setup failure due to changing setup options
- Setup failure in unattended mode
- Setup failure while running through the deployment center
- Setup failure due to software compatibility
- Setup failure due to hardware compatibility
- Setup failure due to old version compatibility
- Setup failure due to race conditions
- Setup performance is unacceptably low
Now we have a list of possible failure scenarios that can be addressed by testing. Each of the items above has a different combination of probability and impact. For example, "Setup failure in the default setup conditions" may have a low probability but the highest impact. Meanwhile, the risk "Setup failure while running through the deployment center" may have a lower impact but the highest probability.
FMEA has a lot of templates that help to summarize and analyze this information. You can easily find them on the internet. Most of them are overburdened with information you will hardly need in your analysis, so I suggest my own variant of the table:
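To make the idea of combining probability and impact concrete, here is a minimal sketch in Python. It is only an illustration: the numeric scale, the multiplication rule and the names `SCALE`, `FailureMode` and `rank` are my assumptions, not part of any FMEA standard. I also assume that "lower impact" for the deployment center case means "medium".

```python
from dataclasses import dataclass

# Hypothetical ordinal scale; the article itself only uses words
# such as "low", "medium" and "highest".
SCALE = {"low": 1, "medium": 2, "high": 3, "highest": 4}


@dataclass(frozen=True)
class FailureMode:
    name: str
    probability: str
    impact: str

    @property
    def risk_score(self) -> int:
        # Assumed rule: score = probability x impact.
        return SCALE[self.probability] * SCALE[self.impact]


def rank(modes):
    """Return failure modes ordered from the highest score to the lowest."""
    return sorted(modes, key=lambda m: m.risk_score, reverse=True)


modes = [
    FailureMode("Setup failure in the default setup conditions", "low", "highest"),
    # "medium" impact is an assumption standing in for "lower impact".
    FailureMode("Setup failure while running through the deployment center", "highest", "medium"),
]

ranked = rank(modes)
for mode in ranked:
    print(f"{mode.risk_score:>2}  {mode.name}")
```

With a scale like this, the order of the list tells you which failure mode to address first. A different scale or rule may well reorder the list, so treat the numbers as a conversation starter, not as a measurement.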
| # | Failure mode | Impact | Probability | Prevention | Comments |
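If you prefer to keep the table next to your test code, the columns above can be sketched as a small Python structure. This is just one possible shape: `FmeaRow`, `HEADERS` and `render` are hypothetical names, and I treat the first column as a row number, which is my reading of the header. The prevention cell is left empty on purpose, because prevention depends on the context.

```python
from dataclasses import dataclass, astuple

# Column names taken from the table header; "#" as a row number is an assumption.
HEADERS = ("#", "Failure mode", "Impact", "Probability", "Prevention", "Comments")


@dataclass
class FmeaRow:
    number: int
    failure_mode: str
    impact: str
    probability: str
    prevention: str = ""
    comments: str = ""


def render(rows):
    """Render rows as a pipe-delimited table, one line per row."""
    lines = ["| " + " | ".join(HEADERS) + " |"]
    for row in rows:
        lines.append("| " + " | ".join(str(value) for value in astuple(row)) + " |")
    return "\n".join(lines)


rows = [
    FmeaRow(1, "Setup failure in the default setup conditions", "highest", "low"),
]
print(render(rows))
```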
Prevention depends on the context, so I can only provide an example.
One of the systems I worked on with a test team was a very old and large client-server application. Reliability was a real problem, so I put this risk in the table and started thinking about ways to change things for the better. The probability of the failure mode was estimated as medium; the impact was the highest. Prevention included non-stop automated reliability tests running 24x7 on a dedicated server. As a result, we found major issues that could never have been found by other types of testing.
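I can't share the actual tests, but the idea of a non-stop reliability run can be sketched in a few lines of Python. Everything here is hypothetical: `soak` and `flaky_operation` are illustration names, and a real run would call the system under test for days, not a simulated operation for a fraction of a second.

```python
import time
from typing import Callable, List, Tuple


def soak(operation: Callable[[], None], duration_s: float,
         pause_s: float = 0.0) -> Tuple[int, List[Tuple[int, str]]]:
    """Call `operation` repeatedly until `duration_s` elapses.

    Returns the number of runs and a list of (run number, error) pairs.
    """
    deadline = time.monotonic() + duration_s
    runs = 0
    failures: List[Tuple[int, str]] = []
    while time.monotonic() < deadline:
        runs += 1
        try:
            operation()
        except Exception as exc:
            # Record the failure and keep going: one error must not stop the run.
            failures.append((runs, repr(exc)))
        if pause_s:
            time.sleep(pause_s)
    return runs, failures


if __name__ == "__main__":
    counter = {"calls": 0}

    def flaky_operation() -> None:
        # Simulated operation that fails on every 1000th call.
        counter["calls"] += 1
        if counter["calls"] % 1000 == 0:
            raise RuntimeError("simulated failure")

    total, errors = soak(flaky_operation, duration_s=0.2)
    print(f"{total} runs, {len(errors)} failures")
```

The important design choice is that the loop records a failure and continues. Issues that only show up after many hours of operation stay hidden if the run stops at the first error.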
So, the bottom line is:
- Strategies developed in other industries work in software development too, so don't dismiss those findings just because "we are so different". That is not the case, nor is it an excuse.
- Try to foresee possible failure scenarios.
- Build a prevention strategy (which can include not only testing but all process stages).
- Define a plan that lists all the required prevention steps.
- Work to the plan but keep an eye on the FMEA table. Things may change as you go, so corrections may be needed.
Hope this helps! I would be happy to learn what you think.