Sunday, December 04, 2011

Just Fix It

The writing-about-testing mailing list recently hosted a discussion of defect tracking.  Given a good enough code base and a mature dev team, I think defect tracking is mostly unnecessary, and it's worth talking about why that is.

Some time ago a popular meme in the agile testing community went "Just Fix It".  I haven't heard it mentioned in a while, and I think it's worth reviving the discussion.  The idea behind Just Fix It is to bypass the overhead of creating a defect report, sending that report through some sort of triage process, and only then addressing the problem the report describes.  You save a lot of time and overhead if you Just Fix It.

For some time now I have specialized in testing at the UI, a combination of functional testing and UX work.  In my experience, in a good code base, important defects found at the UI level are almost always what I think of as "last mile" issues, where the underlying code has changed in some way but the hook for that code into the UI has been mangled or overlooked.  These are cases where unit tests are almost certainly passing, but the app is broken anyway.  Some examples:

  •  Explanatory text has disappeared or no longer describes function accurately.
  •  A widget that used to function no longer does.  For example, a Submit button no longer makes anything happen.
  •  A call to some underlying function is no longer correct.  For example, a Search function that used to return results no longer does.

While a Just Fix It culture is not necessarily agile, examples of Just Fix It are easier to describe in a typical agile situation.

Small Team

Many agile teams share a single space, making communication easy and instantaneous. In such a situation, a conversation like this might happen:

Tester: "Hey, the froobnozzle stopped froobing, anyone know anything about that?"
Dev: "Wow, I didn't realize my last commit would break the froob function, I'll Just Fix It."

This is a canonical example of what Lisa Crispin calls the "whole team approach", where testers, devs, and everyone else are working on the same stories in the same place at the same time.

And if it's appropriate, there's no real reason a tester couldn't Just Fix It themselves.

Large Team

But some teams are too large for a conversation like this to be practical.  Assume a collocated team with a really big story board with dozens of story cards all being moved around a lot.  Say a tester finds an issue with the froobnozzle. 

Tester: grabs a red sticky note and writes brief description of froobnozzle problem. Puts sticky note on froobnozzle story card
...minutes later...
Dev: whoa, a red card on my story, better Just Fix It. 

Distributed Team

Distributed teams tend to have really sophisticated issue-tracking systems in place, in which stories are represented in software of some sort and can be assigned, have their status changed, and so on.  If a distributed team is small enough, a tester will know that Joe is working on the froobnozzle story, so:

Tester to Joe on IM: "hi Joe, I think you might have just broken the froobnozzle."
Joe the Dev:  "whoa, good catch, I'll Just Fix It."

Large Distributed Team

On a large distributed team, identifying who might be in a position to Just Fix It can be complicated.  One strategy is to read the commit logs upon identifying a defect to see who or what may have caused the problem.  Another strategy might be to review all the stories in play to discover who might be working on the froobnozzle this iteration. 
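The commit-log strategy can be roughed out in a few lines of Python. This is only a sketch, assuming the project lives in a git repository: the names `parse_log` and `recent_commits`, the two-day window, and the path you pass in are all hypothetical and purely illustrative. The only real interface involved is `git log` with its `--since` and `--pretty=format:` options.

```python
import subprocess
from typing import List, NamedTuple


class Commit(NamedTuple):
    sha: str
    author: str
    subject: str


def parse_log(log_text: str) -> List[Commit]:
    """Parse lines of the form '<short sha><TAB><author><TAB><subject>'."""
    commits = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first two tabs only, so a subject containing tabs survives.
        sha, author, subject = line.split("\t", 2)
        commits.append(Commit(sha, author, subject))
    return commits


def recent_commits(path: str, since: str = "2 days ago") -> List[Commit]:
    """Ask git who recently touched `path` (hypothetical helper, newest first)."""
    result = subprocess.run(
        [
            "git", "log",
            f"--since={since}",
            # %h = short hash, %an = author name, %s = subject, %x09 = tab
            "--pretty=format:%h%x09%an%x09%s",
            "--", path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_log(result.stdout)
```

Calling something like `recent_commits("app/froobnozzle")` (a made-up path) from inside the repository would list the most recent authors first, which gives the tester a short list of people to ask.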

But sometimes these sorts of approaches are too complicated or take too much time.  One pattern I have seen on several occasions in large distributed teams is to designate a knowledgeable person on the dev staff, or possibly a Scrum Master type, to represent the whole dev team for questions about behavior or function.  I have seen this role called the Face, and the Ninja, and the Disturbed.  

Tester:  "hi Face, I just discovered that the froobnozzle got broken within the last day or so."
Face: "whoa, let me check that for you"
Face: "good catch, Joe broke that two commits ago, he's Just Fixing It"

Defect Found in Production

A customer probably reported it.  The fix is deployed to production within minutes or maybe hours of the report.  (Again, a good code base allows this.)

I worry that too often "root cause analysis" is a synonym for "blame".  Defects in production are almost certainly a process problem, and the place to address process problems is in retrospectives or similar conversations.

Besides, if your team is releasing so many defects to production that you have to track them, you have bigger problems.

Won't Fix

True story:  just this week I was refactoring some Selenium tests and discovered a bug.  This was in a part of the application that is not exposed to customers; it is only for internal users employed by my company.  The bug was that attempting to enter a duplicate record caused an unhandled exception, and the user was presented with an ugly stack trace.  This was an old bug, and was not part of the work of the current iteration.

I work on a large distributed team.  As I noted above, we have a sophisticated issue-tracking system in place.  All of the work we do is documented and tracked in this system.  We have no designated defect-tracking system, just a single monolithic sophisticated issue-tracking software application.

Upon finding the bug, I had a conversation with the dev who knows about that part of the code.  We agreed that this was a no-harm-no-foul situation:  no data corrupted, minimal inconvenience to the user, no customer exposure.  We agreed that Just Fixing It right this minute wasn't very important.

So I created a new issue in the issue-tracking system and assigned it to the dev who knows about that part of the code.  This issue has the same visibility and status as every other issue in the system.  My bug report will be treated the same as every other issue in the system:  included in the backlog, and prioritized to be worked on alongside everything else in the backlog.

I don't even really think of that particular issue as a defect.  It's just a description of the state of a part of the application, some work that we might choose to do at some point.  I'm sure we'll Just Fix It pretty soon. 





5 comments:

Alex said...

Great tips. What about large distributed teams with offshore team members in different time zones? I've recommended the "red card" analogue using our software collaboration tool for issues within the iteration.
