"Some birds aren't meant to be caged, their feathers are just too bright"- Morgan Freeman, Shawshank Redemption. This blog is from one such bird who couldn't be caged by organizations who mandate scripted software testing. Pradeep Soundararajan welcomes you to this blog and wishes you a good time here and even otherwise.

Friday, November 07, 2008

Rating testers based on number of bugs they find - It's definitely possible

In a CMM Level 5 organization I worked for about two years ago, I got an e-mail from my Senior Manager saying that testers would be rated based on the number of bugs they find. At that time I was an ace bug slogger (meaning, I logged a lot of bugs) for the team.


I was the first one to revolt against the idea of rating a tester based on the number of bugs they find, despite knowing that it could benefit me a lot. I was aware of the importance of other testers in the team who weren’t as good at bug slogging as I was but had skills that I did not have, skills that a test team required. For instance, I wouldn’t execute test cases. A combination of scripted and exploratory testing was important for that team. A customer insisted that the test cases (about 140) be executed and a report be sent to him, without discarding the idea that we could do exploratory testing. Some testers in the team had brilliant ideas for creating test files that helped the team find a lot of bugs. Some testers were documentation-savvy guys who used to sit patiently for hours together to generate release reports. We shared our tasks pretty well.


I replied to the e-mail saying, “Although I see that this policy could benefit me, I see it as a threat to the value we testers can add and have been adding. I wouldn’t mind quitting the organization if this policy is serious and is put into action”.


The idea of losing a tester like me on a project worth a lot of US dollars sounded bad to the management, and they changed the policy to “Testers will be rated based on the number of bugs that customers find”, to which my reply was, “This doesn’t make me any more comfortable”.


In this context, I started to look for articles about testers being rated based on the number of bugs they find and bumped into Dr Cem Kaner’s article “Don’t Use Bug Counts to Measure Testers”. Its conclusion is super fantastic: “If you really need a simple number to use to rank your testers, use a random number generator. It is fairer than bug counting, it probably creates less political infighting, and it might be more accurate”.


I was very impressed by the ideas in that article and shared it with my management. To my knowledge, no measurement of bugs to rate testers happened.


Until a couple of days ago, I thought that testers should not be rated based on the number of bugs they find.
Well, it’s possible.


I read a post from Sharath Byregowda, a tester whom I respect and collaborate with in Bangalore. No, he didn’t write that it’s a good idea to rate testers based on number of bugs they find. I often question myself, my ideas and beliefs.


Can I still use bug counts to measure testers, although I know it’s a bad idea, and make a good judgment out of it?


Then an idea struck: “Measure testers based on their reaction to being rated based on the number of bugs they find”.


All the good and great testers that I know, and those who understand testing better than the major population of testers, would oppose the idea. There you are – you know who understands testing better than your other staff, whom to retain and who cannot be easily replaced.


Those who test whatever information they get are likely to test the product better than those who don’t test the information they have been given.


If you give a specification document, test plan document or test case document to someone who doesn’t test the information in it (which most management does), you have a problem with your hiring and staff. I have witnessed testers who think of those documents as the Bible, the Holy Quran and the Bhagavad Gita.


Watch the following video. While you watch the video answer my question:

Isn’t what you see in the video strikingly similar to what most testers do?






As you watch the video, you might see the monkey doing things very similar to what scripted testers do, and you would be reminded of:
Step 1: Open that

Step 2: Click Here,

Expected Result: This should happen and then you get a peanut.


Many such peanuts satisfy your hunger.


Such humans can definitely be rated on the number of times they find things that they are expected to find through a tightly written script. Now you know why some testers use the term “monkey testing” for something that they do, and now you also know why they’re being paid peanuts.


Disclaimer: The use of the monkey training video isn’t meant to speak ill of the training those monkeys were receiving. I respect and appreciate that those monkeys are being trained for the noble job of helping physically handicapped people get their jobs done. Kudos to the team at http://www.monkeyhelpers.org for their brilliant idea of using monkeys to help physically handicapped humans.
--
Pradeep Soundararajan - Software Testing Videos: http://www.viddler.com/explore/testertested

19 comments:

Dumitru said...

Very interesting angle, but as always there is no BEST way, just a good way.
This approach will miss testers who literally don't care about their "rating" in management's eyes; they just do their thing as well as they can. They most definitely won't fight this way of “measuring”, not because they agree, but because it's too stupid to be bothered about. This way you’ll find “thinking testers” but you can’t be sure about the one who didn’t say anything.

By the way, this intelligent measurement is used in development as well. You are as good as the number of defects you fix, regardless of whether you fixed 20 misspelling defects in a form while someone else fixed a mind-blowing technical issue.

Pradeep Soundararajan said...

This approach will miss testers who literally don't care about their "rating" in management's eyes; they just do their thing as well as they can. They most definitely won't fight this way of “measuring”, not because they agree, but because it's too stupid to be bothered about.

I haven't come across anyone who isn't bothered about his pay.


This way you’ll find “thinking testers” but you can’t be sure about the one who didn’t say anything.

I usually find all bugs in the products I test but report just a few. How many did I find?

That should answer :-)

By the way, this intelligent measurement is used in development as well. You are as good as the number of defects you fix, regardless of whether you fixed 20 misspelling defects in a form while someone else fixed a mind-blowing technical issue.

How about intentionally introducing a million and fixing a million plus 10? :P

Ravisuriya said...

I too have come across such incidents, where a tester is judged on bug submission counts. This made me uneasy because it disturbed the test team's unity and spirit, and it made me explain to the team that I am not after hitting counts or numbers and do not have such an attitude.

As Pradeep said, each tester has a unique testing ability that contributes to the team, the employer and the customers too. These words are as true as nature is.

Recently, I was evaluating a tool; I submitted my reports to the management and presented a demo of the tool. On one screen, each tester's submission counts, priority, severity and fixes were displayed graphically. The same was done for development, i.e., the fix statistics for each developer.

Seeing these graphical statistics, a person quickly said, "It is a very consolidated view and good to have. It gives the tester's efficiency here in this graph and the developer's strength in the other graph".

These were not at all motivating words for a team, whether testing or development. I opposed the idea during the presentation itself, and I do not use such processes and statistical graphs to judge someone as a tester or developer. I suspect most management relies on such graphical and numerical statistics to rate testers and developers. If you are doing so, you are not contributing to bringing up worthy, sapient testers and sapient developers for the community; instead, it will be demotivating most of the time.

The sapient testing and sapient skills of a tester are never within reach of any rating or appraisal. They can be felt and experienced in a sapient team and its practices; if not, it is time for us to create such a sapient environment and transform ourselves into sapient testers with sapient skills.

Love Testing!


Ravisuriya

Raj said...

hey pradeep. liked your way of challenging the thought.

i have written a post on my blog some 18 months back against the practice

http://geektester.blogspot.com/2007/09/ponc-those-who-speak-of-progress.html

Rama said...

Hi Pradeep!
Very true statements! Felt happy that someone thinks about this practice!
But this rating has actually got two sides of it (depending on what their mind set is about testing).
I have worked under 2 different managements which used to have such a measure by numbers.
In the first one, though it did not have major impacts on the work style, it indirectly cultivated a healthy competition among the testers. This is because of the fact that the testers had clearly understood that the Goal is Quality and not count! Indeed it motivated us to try harder in the right direction!
Now the second case: No words to say! It’s just like your monkey example! And people are given titles like "Bug Master" for finding many bugs, and so on!
But if you really happen to read the bugs reported by the Bug Master, it will never make anyone feel the title being given is justified!
And what makes me feel even worse is this: if people who are in the starting phase of their testing career are guided by such practices, will it guide them to the right path?

Dave said...

Indeed an interesting angle Pradeep! However I cannot agree with this statement. Why measure a tester on the number of bugs they find?? If a manager is satisfied with the tester's effort and work, then he is 'worthy'? I can imagine that you rate a tester with the help of a checklist looking at his/her behaviour, communication, knowledge, etc.
If you really want to measure a tester on the number of bugs, then measure the bugs with a high impact on the process. That is more fair, I think....

Pradeep Soundararajan said...

@Rama,

In the first one, though it did not have major impacts on the work style, it indirectly cultivated a healthy competition among the testers. This is because of the fact that the testers had clearly understood that the Goal is Quality and not count!

If numbers cultivated competition then it isn't healthy in all contexts. Most testers I have met who talked about quality didn't know what it means, nor were they confident about what they meant by it.

If you still have access to the team, you might want to ask each one of them what quality means.

Now the second case: No words to say! It’s just like your monkey example! And people are given titles like "Bug Master" for finding many bugs, and so on!
But if you really happen to read the bugs reported by the Bug Master, it will never make anyone feel the title being given is justified!


Sometimes monkeys need motivation beyond peanuts. The management that you quoted seems to understand that perfectly.

What do they do?

Fire those monkeys and hire other monkeys?


And what makes me feel even worse is this: if people who are in the starting phase of their testing career are guided by such practices, will it guide them to the right path?


Well, those who choose to be guided by such practices deserve the (mis)guidance.

Pradeep Soundararajan said...

@Dave,

Why measure a tester on the number of bugs they find?? If a manager is satisfied with the tester's effort and work, then he is 'worthy'? I can imagine that you rate a tester with the help of a checklist looking at his/her behaviour, communication, knowledge, etc.
If you really want to measure a tester on the number of bugs, then measure the bugs with a high impact on the process. That is more fair, I think....


My question is: If we have time to think of measuring testers, why don't we spend that time on retrospecting on what we missed, achieving better coverage, doing tests that we thought "no one would do that", and so on?

I do not have a checklist of things to rate testers on. When I worked as a test manager, I didn't rate anyone but tried to understand what kind of value each tester was adding to the project.

For instance, there was a tester who demonstrated great patience in running tests from scripts, and the team badly needed him because only he could do that.

A test team requires diversity, and you might agree that not many teams have diversity, at least in India, because everyone in the team has spent a lot of time trying to understand QTP, WinRunner...

I just want to reiterate what I said, perhaps rephrased: For those managers who rate testers based on the number of bugs they find (which I agree is a very bad idea), I still have a proposal (which I think is not as bad an idea as the previous one) to rate based on their reaction to being rated based on the number of bugs they find.

Ravisuriya said...

Pradeep,

Can you please explain a bit more what the statement, "I still have a proposal (which I think is not as bad an idea as the previous one) to rate based on their reaction to being rated based on the number of bugs they find." means?

I understood it as the kind of attitude and learning that a tester shows in different contexts in testing. Am I right?


Ravisuriya

Pradeep Soundararajan said...

@Ravisuriya,


Read it all over again and keep reading; there is enough explanation there.

Dave said...

@Pradeep

If I understand you correctly, you want to rate them on their psychological response to being rated? On how testers react in their behaviour if they know that they could be 'in the picture' by their managers? (correct me if I am wrong).
Hmmm..that could be a solution, however in that case, people's psychological strength is tested and not their knowledge.
The vision that you build a test team on what you need, is plausible. I prefer that and then rate a tester on how he fits in the team...

Pradeep Soundararajan said...

@Dave,

Thanks for coming back.

If I understand you correctly, you want to rate them on their psychological response to being rated? On how testers react in their behaviour if they know that they could be 'in the picture' by their managers? (correct me if I am wrong).

You are right, but with a small correction: I wouldn't do that kind of rating. The message was to the kind of managers who want to rate on numbers and aren't willing to get rid of the idea of rating based on numbers.

The vision that you build a test team on what you need, is plausible. I prefer that and then rate a tester on how he fits in the team...

Yeah, I agree. It's the team that matters, not measuring just one sample in comparison to other people in the team.

Pradeep Soundararajan said...

@Dave,

Sorry, I missed responding to this one: Hmmm..that could be a solution, however in that case, people's psychological strength is tested and not their knowledge.

I find it too tough to understand how knowledge or any other aspect of a human can be tested to an extent that we can rely on those results, as we never know how much we know.

James Bach asks a question often to testers: How do you know what you know?

It so happens that sometimes I play a computer game that I have played thousands of times and I face a meteor falling on my tank from 30 degrees north, I get hit and then say, "Pradeep, didn't you know it's going to come from there?"

So, do I know it?
Do I know it enough?
How much is enough?

dsfsf said...

Hi Pradeep,

I was an ardent reader of your blog, BUT nowadays I feel that you are not in a good mood; the topics are boring and tasteless. And most of the topics start with "ME, ME and I, I".

Gaurav Pandey said...

Among many suggestions, one suggestion to rate software testers would be bug count based, however with a twist.

Let's say a tester submits 200 defects for a software.
Question: How many of these defects were actually fixed?
Question: How many of these defects came back as "not able to reproduce"?
Question: How many of these defects came back as "But this is a feature"?
Question: How many of these defects were kept as "deferred"?
Question: How many of these defects were kept as "Known Issue"?

Also, it would depend on
1. How many of these defects were discovered using scripted tests?
2. How many of these defects were discovered using exploratory testing? (Why were scripts not available for this defect)

Also, keeping in mind the risk factor:

1. How many were high risk?
2. Were any high risk defects discovered late in the testing cycle?

Additional factors
- Project constraints
- Lack of documentation
- Lack of specifications

The list can go on..

Pradeep Soundararajan said...

@dsfsf,

I was an ardent reader of your blog, BUT nowadays I feel that you are not in a good mood; the topics are boring and tasteless. And most of the topics start with "ME, ME and I, I".

I want to introduce you to a reader who enjoys every post and thinks every post is vital and tasty to a thirsty testing mind. That reader thinks Tester Tested! is Pradeep's blog and he is free to share whatever he thinks. That reader thinks, who else will write about the experiences that Pradeep had in such detail other than Pradeep himself? That reader thinks Pradeep is not writing just to please people. That reader is also aware of other bloggers who write and choose topics to please people and get more readers. That reader is aware that what matters to Pradeep is not how many people read his blog but how many have actually derived benefit by spending time reading his blog. That reader suggests that you stop bothering about where Pradeep uses, starts or ends with "I" and "me" and get the message (if any) between those I and Me.

You should meet that reader someday and he will explain to you everything about Pradeep because that reader believes making conclusions about a person without having talked to him for a while and by merely reading some posts is a bad idea.

You should someday meet that reader and I hope you meet him or people like you meet him.

When I was studying 10th grade history, I felt it was boring and a waste of my time. After a long time (about 7 years) I accidentally got hold of that book while cleaning my room, read the first few pages and realized it was so interesting.

Manage your bias!

dave said...

@Pradeep

So, do I know it?
Do I know it enough?
How much is enough?


These are good questions. I really don't have a fitting answer to these questions. Often you have to rely on your own gut feeling.
It is like during the intake of a tester: does his/her resume fit your requirements, do you like him/her as a person, and does he/she fit in your team?
These considerations are in my opinion the same when you have to rate a tester.

Pradeep Soundararajan said...

@Dave,

It is like during the intake of a tester: does his/her resume fit your requirements, do you like him/her as a person, and does he/she fit in your team?
These considerations are in my opinion the same when you have to rate a tester.


Thanks for bringing up the word "opinion". When we interview or claim to assess a person, we are forming an opinion based on the information we collect. It remains our opinion, based on the tests we did and the information we think we collected. I think translating opinions into numbers is a bad idea.

There is nothing wrong in forming opinions, and everyone has the freedom to do that, irrespective of whether we convey them to the other party or not :)

Harish said...

The question still remains - how do you rate testers? It is important to rate fairly to
- have healthy competition in the team
- motivate testers to do a better job
- improve productivity of the team

Apart from doing a good job at it, it is important that the appraisal process seems fair to all concerned.

This becomes more and more important as the team size increases. In a bigger team, you may not know sufficiently well how each person is performing. You might be required to rate a person who does not report to you directly (say, one who reports via a team lead). Or take a scenario where you are not rating a person but doing the second-level review. How do you know the review done by the direct supervisor is good? And what do you say to a person who comes to you with the grievance that he has not been rated fairly? You cannot rely on self-appraisal, and there is also the possibility of bias by the direct supervisor. Also, how do you explain to your superiors that your reviews are fair and unbiased?

To be seen as fair, the subjectivity needs to be minimized to the extent possible. How do you achieve this? Some of the indicators of performance could be:
1. Feedback of peers/ Dev team/ Manager/ client
2. Number of bugs found
3. Number of post-delivery bugs (bug leakage)
4. Technical/ domain/ business knowledge of the tester
5. Effort put in by the tester

A lot of other points can be added to this list. But there are shortcomings in using any of the indicators above (as repeatedly mentioned in your blog, and I do agree with the objections).

Let's take a hypothetical scenario where you need to rate/compare two testers. Tester A reports a high number of bugs, and no major post-delivery defect is found in the modules tested by him. He gets appreciation from peers (as he is available for help), from the client, and from the dev team (as he finds most of the bugs in round one of testing). He is an expert technically as well as in the domain in which he is testing. The other tester, B, is the exact opposite of this.

It is obvious that you would rate tester A higher than tester B. If you agree with this, what is the basis of the conclusion? It is the sum total of the different indicators stated above (I am not suggesting these indicators are exhaustive or sufficient).

So, I would say that numbers do play a role in indicating performance (though they need to be read along with other subjective criteria). Yes, numbers can be manipulated (and here human judgement is important to see through the manipulation). But numbers do have an advantage: objectivity, and they look fair to all concerned.

- Harish