Using the rejected defect ratio to improve bug reporting
Have a great Friday, everyone! At the end of June we are launching a new group for the QA Specialist course, which is the occasion for today's publication.
There are many metrics you can use to measure the effectiveness of a testing team. One of them is the Rejected Defect Ratio (RDR): the number of rejected bug reports divided by the total number of reports received. You might think that zero rejected reports is the goal, but it is not that simple. Let's look at the types of rejected bugs, see how they affect the ratio, and figure out what a reasonable value looks like for your team.
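As a formula, the metric is simply rejected reports over total reports. The sketch below is a minimal Python illustration; the function name and the sample numbers are hypothetical.

```python
def rejected_defect_ratio(rejected_reports: int, total_reports: int) -> float:
    """RDR = rejected bug reports / total bug reports received."""
    if total_reports == 0:
        return 0.0
    return rejected_reports / total_reports

# Hypothetical example: 12 of the 80 reports filed in a release cycle were rejected.
print(f"RDR = {rejected_defect_ratio(12, 80):.0%}")  # RDR = 15%
```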
There are three categories of rejected bugs:
- Irreproducible bugs;
- Invalid bugs;
- Duplicate bugs.
Let's take each in turn.
Irreproducible bugs
There are two kinds of irreproducible bugs. The first is a bug that is genuinely hard to reproduce, for example one caused by the interaction of several parameters, some of which you may not even be aware of.
Suppose you ran several tests in a row, and one of them changed a configuration parameter from its default value A to some other value B. The bug occurs only when the configuration parameter is set to B and the input value is C. When trying to reproduce the bug, you will most likely want to start from a known state and re-initialize the system (or perhaps perform a clean installation). The bug will not occur, because the configuration parameter is back to its default value A.
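Here is a minimal sketch of that situation; all names and values are hypothetical and exist only to show the mechanics:

```python
# Hypothetical illustration: the defect only fires when config == "B" and input == "C".
config = "A"  # default value


def run_feature(value: str) -> str:
    if config == "B" and value == "C":
        raise RuntimeError("defect: unhandled B/C combination")
    return "ok"


# An earlier test quietly changed the configuration...
config = "B"
# run_feature("C")  # would raise RuntimeError: the failure the tester originally saw

# ...but the reproduction attempt starts from a clean, re-initialized state:
config = "A"
print(run_feature("C"))  # prints "ok", so the bug does not reproduce
```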
Another variant of this kind of irreproducible bug is when the test really did find a defect, but the reproduction information in the report is missing something: a step, a specific input value, or the fact that the bug occurs only after a particular sequence of actions. As a result, attempts to reproduce it go nowhere.
In both of the above cases, however, there is indeed a defect in the product itself.
The second kind of irreproducible bug is one that cannot be reproduced because it does not actually exist. The tester may have noticed something but misinterpreted it, or the system used for testing may have had a problem of its own, such as a faulty hardware component, an incompatible driver, or incorrect access settings. Attempts to reproduce the bug on a properly configured system fail.
Both of these types are usually flagged in bug tracking systems as “rejected - cannot reproduce.”
Invalid bugs
This type of rejection occurs when a tester decides that the product should behave in a certain way and reports a bug because the behavior did not meet that expectation. A closer reading of the requirements, however, shows that the tester's expectation was wrong and the product actually behaved correctly. In other words, the product worked as specified, and the tester, not being familiar enough with the requirements, made a mistake.
Such reports are usually flagged in bug tracking systems as “rejected - not a bug” or “rejected - as designed” (that is, the behavior matches the design).
Duplicate bugs
Duplicate bugs are bugs that someone has already reported and that someone else then reports again. A bug counts as a duplicate only if its “symptoms” are the same. If the root cause is the same but the “symptoms” are different, it is not a duplicate!
These are usually flagged in bug tracking systems as “rejected - duplicate.”
How rejected bugs affect a team
Obviously, a rejected bug report is a waste of time: the time the tester spent reproducing and reporting the bug, the time the people who triage bugs spend reading and understanding the report, and the time developers spend trying to reproduce an irreproducible bug or fixing (and possibly breaking) something that did not need fixing.
Beyond being a measure of wasted effort, the rejected defect ratio, or RDR, also says something about the professionalism of the testers. A bug that cannot be reproduced because the report lacks the necessary information suggests that the testers were not meticulous and did not try hard enough to reproduce the bug from their own steps before filing it. Likewise, for bugs that reproduce only occasionally, it suggests the testers failed to note the low reproduction rate in the report.
An invalid bug indicates that the testers do not fully understand the product requirements. Duplicate bugs indicate that the testers did not do even a minimal search of the bug database to check whether the issue had already been reported, or that the person who reported it first did not include good keywords in the title to make it easy for colleagues to find.
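As a rough illustration of the kind of pre-filing check meant here, the toy sketch below matches a new report title against existing titles by shared keywords; the titles and the helper function are invented for the example.

```python
# Toy duplicate check: compare a new report title against existing titles by shared keywords.
existing_titles = [
    "Export to CSV drops the last row",
    "Login fails when the password contains unicode characters",
]


def possible_duplicates(new_title: str, titles: list[str]) -> list[str]:
    """Return existing titles that share at least two keywords with the new title."""
    keywords = set(new_title.lower().split())
    return [t for t in titles if len(keywords & set(t.lower().split())) >= 2]


print(possible_duplicates("CSV export drops rows on save", existing_titles))
# ['Export to CSV drops the last row']
```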
Personally, when a bug I found gets rejected, I take it a little personally: it implies I acted unprofessionally. On the one hand, this means I will defend the bugs I find. When my report is rejected, I do the following:
- I re-check that the bug reproduces on my system and add any reproduction steps I missed;
- If my misunderstanding was caused by an ambiguous requirement or incorrect documentation, I insist that the bug be reclassified as a documentation bug and closed only once the documentation is fixed;
- If I believe the behavior is wrong even though it matches the requirement, I discuss the requirement with the architects and developers and try to convince them it needs updating (after all, I represent the customer's point of view!);
- If the bug is rejected as a duplicate, I check that it really does show the same symptoms and follow the same scenario as the earlier report.
On the other hand, the possibility of rejection makes me careful. If I am not completely sure I have found a bug, I spend a little more time checking before reporting it. I often ask a colleague whether I am interpreting the requirements correctly, or check whether the bug reproduces on someone else's system.
The case against zero rejected bugs
A testing team should monitor its RDR and work to keep it down. The only question is: what RDR should be considered good?
At first glance 0% seems like the ideal, but I strongly disagree. I believe an RDR held at some healthy level is normal; a team whose RDR is close to zero most likely suffers from problems no less serious than those of a team with a very high RDR.
To achieve an extremely low RDR, a testing team has to go to great lengths. Every rejected bug is analyzed to understand what went wrong, and every tester whose report was rejected has to explain what actually happened and how to avoid it in the future. The result is that testers will only report bugs they are absolutely sure about.
If they notice behavior they think will hurt the product's usability, they will prefer to accept it as given rather than argue for a bug that, strictly by the requirements, is not one. If they have evidence that a failure occurred but no solid scenario for reproducing it, they will prefer not to report it rather than risk another rejection. If they run into a minor bug, they may decide not to report it at all: minor bugs do not always get fixed anyway, so why risk having yet another report rejected?
In short, pushing for a very low RDR creates stress and unhealthy behavior in the testing team, and increases the likelihood that some bugs will go unreported.
We need testers who not only report obvious bugs but also flag any suspicious behavior in the product. Testers who care more about making sure a bug does not slip away, even at the cost of an occasional duplicate, are better than testers who spend hours combing the bug database for fear of filing a duplicate. We want testers to feel comfortable questioning the word of the system architect or the requirements specification, even if it means some of their reports get rejected.
We need testers who are not afraid to be wrong from time to time. That requires a balance, which means some small RDR is acceptable.
Finding the optimal rejected defect ratio
My rule of thumb is an RDR of about 15 percent. The number is based on my experience with a testing team that was, by all accounts, good and effective: that was our RDR across several consecutive projects, while another team that worked on the same projects in parallel with us, knew the product less well, and was considered less effective had an RDR of about 30 percent.
I have no justification for this number beyond gut feeling; it is certainly not scientific. I would not argue with a team aiming for 10 or 20 percent, but I think that tolerating 30 percent, or setting a target of 5 percent, already signals a problem.
In the end, this is a decision each testing team must make for itself, based on the product, the team's expertise, the development model, the experience of the development team, and much more. I strongly recommend keeping an eye on your RDR and asking whether it calls for action; if it is too high or too low, take appropriate measures.
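If you do track it, a simple check like the sketch below is enough to flag an RDR drifting out of a healthy band; the 5% and 30% cut-offs are just the rule-of-thumb values from this article, not universal thresholds.

```python
# Flag an RDR that falls outside the rough "healthy band" discussed above.
# The 5% and 30% cut-offs follow the article's rule of thumb, not a standard.
def assess_rdr(rdr: float) -> str:
    if rdr < 0.05:
        return "suspiciously low: testers may be playing it too safe"
    if rdr > 0.30:
        return "too high: review report quality and requirements knowledge"
    return "within a reasonable range: keep monitoring"


print(assess_rdr(0.15))  # within a reasonable range: keep monitoring
```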
As always, we look forward to your comments and invite you to a free webinar on June 14. See you there!