Monthly Archives: August 2010

My Wish List for a Test Case Tracking Tool

Let’s talk for a bit about tracking test cases. Now, before we get bogged down in semantics and hair-splitting, let me point out that I’ve already made my case against them, as have others. I want to focus here on the tracking, so I’ll just speak broadly about the “test case” – which, for the present discussion, is meant to include everything from test “checks” to testing “scenarios”, or even test “charters”.

Speaking provisionally, it’s probably a good idea to keep track of what your team has tested, when it was tested, by whom, and what the results of the test were. Right? I’ll assume for now that this is an uncontroversial claim in the testing world. I’ll venture further and assert that it’s also probably a good idea to keep a record of the clever (and even not-so-clever) testing ideas that strike you from time to time but that you can’t act on at the moment, for whatever reason. Even more riskily, I’ll assert that, in general, less documentation is preferable to more.

Assuming you agree with the previous paragraph, how are you tracking your sapient testing?

I’ve used a number of methods. They’ve each had their good and bad aspects. All have suffered from annoying problems. I’ll detail those here, then talk about my imagined “ideal” tool for the job, in the hopes that someone can tell me either a) “I’ll build that for you” (ha ha! I know I’m being wacky), or b) “It already exists and its name is <awesome tracking tool x>.”

MS Word (or Word Perfect or Open Office)

The good: Familiar to everyone. Flexible. I’ve used these only when required by (apparently horribly delusional) managers to do so.

The bad: Flat files. Organizational nightmare. I’ve never seen a page layout for a test case template that didn’t make me depressed or annoyed (perhaps this can be chalked up to a personal problem unique to me, though). Updates to “fields” require typing everything out manually, which is time-consuming and error-prone.


Spreadsheets

The good: Familiar to everyone. Flexible. The matrix format lends itself to keeping things relatively organized and sortable. Easy to add new test cases right where they’re most logical by adding a new row where you want it.

The bad: It’s still basically a flat file with no easy way to track history or generate reports. Long test descriptions look awful in the cells (though in some ways this can be seen as a virtue). Large matrices become unwieldy, encouraging the creation of multiple spreadsheets, which leads to organizational headaches.


Wikis

The good: Flexible. Generally the wiki tool automatically stores document revision histories. Everyone is always on the same page about what and where the latest version is. Wikis are now sophisticated enough to link to definable bug lists (see Confluence and Jira, for example).

The bad: Still essentially flat files. Barely better than MS Word, really, except for the history aspect.

FileMaker Pro

The good: It’s an actual database! You can customize fields and page layout exactly how you want them without needing to be a DB and/or Crystal Reports expert. I was in love with FileMaker Pro when I used it, actually.

The bad: It’s been a long time since I’ve used it. I stopped when we discovered that it was prone to erasing records if you weren’t careful. I’m sure that bug has been fixed, but I haven’t had a mind to check back. It’s hard to do some things with it that I started seeing as necessary for true usability (I’ll get to those in my wish list below).


This is a proprietary, web-based database tool in use at “Mega-Corp.”

The good: It’s a database. It tracks both bugs and test cases, and links the two, as appropriate. Can store files related to the test case, if needed. Stores histories of the test cases, and you can attach comments to the test case, if needed.

The bad: Slow. Horrible UI. Test team relies on convoluted processes to get around the tool limitations.

Test Director

The good: A database. A tool designed specifically for testers, so it tracks everything we care about, including requirements, bugs, and test cases. It takes screen shots and automatically stores them, making it easy to “prove” that your test case has passed (or failed). Plus, it helps create your entry into the bug database when you fail a test case.

This tool really has come closer to my ideal tool than anything else I’ve used so far.

The bad: The UI for test set organization leaves a lot to be desired. It forces a particular framework that I don’t particularly agree with, though I can see why they made the choices they did. I also think it doesn’t need to be as complicated as they made it. It would be nice to be given the flexibility to strip out the stuff in the UI that I didn’t care about. Lastly, this tool is exorbitantly expensive! Yikes! For it to be useful at all you need to buy enough seats to cover the entire test team plus at least two, for the business analysts and programmers to have access.

My Imaginary Ideal Tool

What I want most…

I want a tool that organizes my tests for me! I want to be able to quickly add a new test to it at any time without worrying about “where to put it.” This is perhaps the biggest failing of flat files. Some tests just defy quick categorization, so they don’t easily “fit” anywhere in your list.

The database format takes care of this problem, to a large degree, to be sure. The tool will have fields that, among other things, specify the type of test (function, data, UI, integration, etc.) and the location of the test, both in terms of the layout of the program from the user’s standpoint and in terms of what parts of the code it exercises.

All that is great, but I’d like the system to go a step beyond that. I want it to have an algorithm that uses things like…

  • the date the test was last executed
  • the date the related source code was updated (note that this implies the tool should be linked to the programmers’ source control tool)
  • the perceived importance and/or risk level of the test and/or the function being tested
  • other esoteric stuff that takes too many words to explain here

I want it to use that algorithm to determine which test in the database is, at this moment, the most important test (or set of tests, if you choose) I could be running. I then want it to serve it up to me. When I accept it by putting it into a “testing” status, the system will know to serve up the next most important test to whatever tester comes along later. Same goes for when I pass or fail the given test. It “goes away” and is replaced by whatever the system has determined is now “most important” according to the heuristic.
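To make the idea concrete, here is a minimal sketch of what such a heuristic might look like. Everything in it is an assumption made for illustration: the `TrackedTest` fields, the 1-to-5 risk scale, the status names, and especially the weights, which are arbitrary and would need tuning against a real project. The “other esoteric stuff” is left out entirely.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class TrackedTest:
    name: str
    risk: int                             # hypothetical scale: 1 (low) to 5 (high)
    last_run: Optional[date] = None       # None means the test has never been run
    code_changed: Optional[date] = None   # last check-in touching related code, if known
    status: str = "waiting"               # "waiting", "testing", "passed", or "failed"


def priority(test: TrackedTest, today: date) -> float:
    """Score a test; a higher score means 'run this sooner'. Weights are arbitrary."""
    if test.last_run is None:
        days_idle = 365                   # treat never-run tests as a year stale
    else:
        days_idle = min((today - test.last_run).days, 365)
    # A check-in after the last run makes the old result suspect.
    stale = test.code_changed is not None and (
        test.last_run is None or test.code_changed > test.last_run
    )
    return test.risk * 10 + days_idle / 30 + (25 if stale else 0)


def next_test(tests: List[TrackedTest], today: date) -> Optional[TrackedTest]:
    """Serve up the most important waiting test and put it into 'testing' status."""
    waiting = [t for t in tests if t.status == "waiting"]
    if not waiting:
        return None
    best = max(waiting, key=lambda t: priority(t, today))
    best.status = "testing"
    return best
```

Each call to `next_test` hands out a different test, because anything already in “testing”, “passed”, or “failed” status is skipped. A real tool would also need some way to put a finished test back into the “waiting” pile, for example when related code is checked in again; this sketch does not attempt that.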

The way I see it, what this does for me is free me from the hassle of document maintenance and worrying about test coverage. The tests become like a set of 3×5 cards all organized according to importance. You can add more “cards” to the stack, as you think of them, and they’re organized for you. You may not have time to get through the whole “pile” before you run out of time, but at least you can be reasonably confident that the tests you did run were the “right” ones.

The other stuff…

Aside from “what I want most,” this list is in no particular order. It’s not exhaustive either, though I tried my best to cover the essentials. Obviously the tool should include all of the “good” items I’ve already listed above.

  • It should have a “small footprint mode” (in terms of both UI and system resources) so it can run while you’re testing (necessary so you can refer to test criteria, or take and store screen shots) but have a minimal impact on the actual test process.
  • As I said above it should link to the programmers’ source control tool, so that when the programmers check in updates to code it will flag all related test cases so you can run them again.
  • It should link to your bug tracking tool (this will probably require that the tool be your bug tracking database, too. Not ideal, but perhaps unavoidable).
  • It should make bug creation easy when a test has failed (by, e.g., filling out all the relevant bug fields with the necessary details automatically). Conversely, it should make test case creation easy when you’ve found a bug that’s not yet covered by existing test cases.
  • It should be possible to create “umbrella” test scenarios that supersede other test cases, because those tests are included implicitly. In other words, if you pass one of these “über-cases,” the other test cases must be considered “passed” as well, because they’re inherent in the nature of the über-case. The basic idea here is that the tool should help you prevent avoidable redundancies in your testing efforts.
  • Conversely, the failure of a test case linked to one or more über-cases should automatically mark those über-cases as not testable.
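The two über-case rules can be sketched in a few lines. This is only an illustration under assumed names: `Case`, `Tracker`, the `covers` list, and the status strings are all hypothetical, and the sketch says nothing about how a real tool would store or display these links.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Case:
    name: str
    status: str = "waiting"   # "waiting", "passed", "failed", or "not testable"
    covers: List[str] = field(default_factory=list)  # cases implied by this one


class Tracker:
    def __init__(self, cases: List[Case]) -> None:
        self.cases: Dict[str, Case] = {c.name: c for c in cases}

    def mark_passed(self, name: str) -> None:
        """Pass a case and, implicitly, every case it covers."""
        case = self.cases[name]
        case.status = "passed"
        for child in case.covers:
            if self.cases[child].status != "passed":
                self.mark_passed(child)

    def mark_failed(self, name: str) -> None:
        """Fail a case and flag every über-case covering it as not testable."""
        self.cases[name].status = "failed"
        self._block_parents(name)

    def _block_parents(self, name: str) -> None:
        for parent in self.cases.values():
            # Leave already-failed or already-blocked über-cases alone.
            if name in parent.covers and parent.status not in ("failed", "not testable"):
                parent.status = "not testable"
                self._block_parents(parent.name)


tracker = Tracker([
    Case("checkout", covers=["add to cart", "pay"]),
    Case("add to cart"),
    Case("pay"),
])
tracker.mark_failed("pay")   # "checkout" can no longer be run meaningfully
```

One design choice worth noting: an über-case that had already passed gets flipped to “not testable” when one of its covered cases later fails, on the theory that the earlier pass is now suspect. Whether that is the right behavior is a judgment call.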

I’d love comments and criticisms on all this. Please feel free to suggest things that you’d like to see in your own ideal tool. Maybe someone will actually be inspired to build it for us!


Likely Posting Rates for the Near Future

In May when I started this blog I was working the final weeks as a contractor at a soul-killing corporation on a testing project that was as mind-numbing as it was dysfunctional. I started the blog as a creative outlet for me; a means of venting my frustrations constructively, since I felt like nothing I said at “Mega-Corp” made any difference.

In addition, I thought, I’d soon be back on the job market. The blog might become a good extension to the tired and typical job-seeker’s resume and cover letter. I saw it as a potential means of showcasing my philosophy and thought processes, as well as my writing style and personality, outside the tight confines of a job interview.

I had no expectations beyond that. I figured blog traffic would max out at around a visit a week. Probably those would be my polite and supportive friends, whom I’d pester to check out my latest ramblings (even though they had no interest in testing, software, or epistemology).

Then something funny happened. As near as I can figure it, a friend tweeted about one of the posts. This tweet was apparently seen by Michael Bolton, who presumably read it, liked it, and also tweeted about it. Suddenly there were intelligent comments from strangers (and respected industry celebrities) who were located all over the world. Suddenly posts were being mentioned elsewhere and included in blog carnivals. Suddenly people other than me were tweeting my posts. Wow!!! Who knew there was a large and vibrant testing community out there? Who knew I actually said anything interesting? Suddenly I felt pressure to maintain a consistent output of new, interesting material.

My contract with Mega-Corp ended. Based on the sparse job prospects over the previous six months, I fully expected to be facing a long stretch of unemployment. I have significant savings, so the idea didn’t scare me. In fact, I was genuinely looking forward to it. Aside from now having ample time to write blog posts, I could engage with this newly discovered testing community via Twitter, their own blogs, LinkedIn, the Software Testing Club, and elsewhere. I could spend a few hours a day learning Ruby–something I’d wanted to do for a while but seemed never to have time for.

Although I went to one interview during the first week of unemployment at the behest of the staffing firm I’d been contracting with, I wasn’t particularly interested in looking for work. I jokingly referred to my unemployment as an “involuntary sabbatical.” What little effort I put toward a job search was haphazard and frivolous. The few job listings that turned up were basically of the sort that had been appearing for the previous several months. They fit into one of three categories:

  1. Positions for which I was overqualified
  2. Positions which I knew I could do but I’d never get the interview for, since they listed specific technical requirements I couldn’t in good conscience put on my resume or in my cover letter
  3. Positions that were the software development equivalent of Gitmo prisoner stress positions

Then something funny happened. On day 11 of the sabbatical I got an email from a headhunter asking if I was looking for work. I wrote back and said that I was. She called. We talked for about 20 minutes. I think most of that time was me saying that my technical skills didn’t match what they had on the list of requirements. She said, “Let’s submit anyway.” I said, “Sure. What the hell?” I was convinced it would go nowhere and went back to the exercises in my Ruby book. Less than an hour later the headhunter called back and said that the company wanted to interview me the next day at 9 a.m. I said, “Sure. What the hell?”

Armed with the company’s name and address, I started the requisite Googling. I found out that the company’s culture included things like letting people bring their dogs to work, giving everyone a Nerf gun and, most importantly to me, no dress code (based on photos on the company’s blog, shorts and flip-flops were standard fare, so my Vibrams would fit right in). So far, so good. Even better, the company was apparently wildly profitable and newly purchased by a larger firm, also profitable. No worries about job evaporation due to investor indictment!

I’ve been on a lot of job interviews this year. In all of them I felt a lack of control, like I was being forced to justify myself or excuse myself. For this one, though, I decided to take a different tack, since I truly didn’t care if I got the job or not. I took a copy of the advertised job requirements with me and went through them line-by-line with the interviewer, saying “What do you mean by X? My current experience with it is limited. I have no doubt I can learn it, but if it’s really important to you, then I’m probably not your guy.” I must’ve said some variant of that a half dozen times. I felt like I was trying to talk them out of picking me.

Somehow the interview lasted three hours. They told me they were going to talk with two more people, but that they wanted to move fast on a decision, so I would know either way by the next day. I could tell they liked me. For my part, the company struck me as a happy place, and what they wanted to hire me for seems to have become my own career specialty: Use your skills and expertise to do whatever is necessary to create a test department where there is none. As I was driving home I was thinking, “Dammit! I may have to cut my sabbatical short.”

I got a call from the headhunter less than two hours later. They were offering me the job. They wanted me to start tomorrow, if I was willing. I agonized over the decision for most of the afternoon. Three to six months of taking it easy, blogging, and learning Ruby, while looking for the perfect job–I had a really hard time giving up this romantic notion, but it seemed like the perfect job had already arrived, just way ahead of schedule. What if I turned it down and the next one didn’t come along for another year, well after my savings had evaporated?

I took the job.

This post has turned into something much more long-winded and shamelessly self-indulgent than I imagined it would be. Thanks for putting up with it. My only point has been to explain that my new job responsibilities over the coming weeks will probably sap my time and my creative energies. The testing problem I’ve been given is very interesting, and I need to focus on how to solve it. So, for the next few weeks, at least, there’s little chance I’ll be writing a post per week. I can’t imagine, though, that it will be too long before I feel a strong urge to vent again.


Requirements: Placebo or Panacea?

[Image caption: Perhaps the definitive commentary on "requirements"]

I vividly remember the days when I matured as a tester. I was the fledgling manager of a small department of folks who were equal parts tester and customer support. The company itself was staffed primarily by young, enthusiastic but inexperienced people, perhaps all given a touch of arrogance by the company’s enormous profitability.

We had just released a major update to our software–my first release ever as a test manager–and we all felt pretty good about it. For a couple days. Then the complaints started rolling in.

“Why didn’t QA find this bug?” was a common refrain. I hated not having an answer.

“Well… uh… No one told me the software was supposed to be able to do that. So how could we know to test for it? We need more detailed requirements!” (I was mindful of the tree cartoon, which had recently been shared around the office, to everyone’s knowing amusement.)

The programmers didn’t escape the inquisition unscathed, either. Their solution–and I concurred–was, “We need a dedicated Project Manager!”

Soon we had one. In no time, the walls were papered with PERT charts. “Critical path” was the new buzzword–and, boy, did you want to stay off that thing!

You couldn’t help but notice that the PERT charts got frequent revisions. They were pretty, and they gave the impression that things were well-in-hand; our path forward was clear. But they were pretty much obsolete the day after they got taped to the wall. “Scope creep” and “feature creep” were new buzzwords heard muttered around the office–usually after a meeting with the PM. I also found it odd that the contents of the chart would change, but somehow the target release date didn’t move.

As for requirements, soon we had technical specs, design specs, functional specs, specs-this, specs-that… Convinced that everything was going to be roses, I was off and running, creating test plans, test cases, test scripts, bla bla…

The original target release date came and went, and was six months gone before the next update was finally shipped. Two days later? You guessed it! Customers calling and complaining about bugs with things we’d never thought to test.

Aside from concluding that target release dates and PERT charts are fantasies, the result of all this painful experience was that I came to really appreciate a couple things. First, it’s impossible for requirements documents to be anything close to “complete” (yes, a heavily loaded word if I’ve ever seen one. Let’s provisionally define it as: “Nothing of any significance to anyone important has been left out”). Second, having document completeness as a goal means spending time away from other things that are ultimately more important.

Requirements–as well as the team’s understanding of those requirements–grow and evolve throughout the project. This is unavoidable, and most importantly it’s okay.

Apparently, though, this is not the only possible conclusion one can reach after having such experiences.

Robin F. Goldsmith, JD, is a guy who has been a software consultant since 1982, so I hope he’s seen his fair share of software releases. Interestingly, he asserts in one article that “[i]nadequate requirements overwhelmingly cause most project difficulties, including ineffective ROI and many of the other factors commonly blamed for project problems.” Elsewhere, he claims “The main reason traditional QA testing overlooks risks is because those risks aren’t addressed in the system design… The most common reason something is missing in the design is that it’s missing in the requirements too.”

My reaction to these claims is: “Oh, really?”

How do you define “inadequate” in a way that doesn’t involve question begging?
How do you know it’s the “main” reason?
What do you mean by “traditional QA testing”?

Goldsmith addresses that last question with a bit of a swipe at the context-driven school:

Many testers just start running spontaneous tests of whatever occurs to them. Exploratory testing is a somewhat more structured form of such ad hoc test execution, which still avoids writing things down but does encourage using more conscious ways of thinking about test design to enhance identification of tests during the course of test execution. Ad hoc testing frequently combines experimentation to find out how the system works along with trying things that experience has shown are likely to prompt common types of errors.

Spontaneous tests often reveal defects, partly because testers tend to gravitate toward tests that surface commonly occurring errors and partly because developers generally make so many errors that one can’t help but find some of them. Even ad hoc testing advocates sometimes acknowledge the inherent risks of relying on memory rather than on writing, but they tend not to realize the approach’s other critical limitations.

By definition, ad hoc testing doesn’t begin until after the code has been written, so it can only catch — but not help prevent — defects. Also, ad hoc testing mainly identifies low-level design and coding errors. Despite often being referred to as “contextual” testing, ad hoc methods seldom have suitable context to identify code that is “working” but in the service of erroneous designs, and they have even less context to detect what’s been omitted due to incorrect or missing requirements.

I’m not sure where Goldsmith got his information about the Context-driven approach to testing, but what he’s describing ain’t it! See here and here for much better descriptions.

Goldsmith contrasts “traditional QA testing” with something he calls “proactive testing.” Aside from “starting early by identifying and analyzing the biggest risks,” the proactive tester

…enlists special risk identification techniques to reveal many large risks that are ordinarily overlooked, as well as the ones that aren’t. These test design techniques are so powerful because they don’t merely react to what’s been stated in the design. Instead, these methods come at the situation from a variety of testing orientations. A testing orientation generally spots issues that a typical development orientation misses; the more orientations we use, the more we tend to spot. [my emphasis]

What are these “special risk identification techniques”? Goldsmith doesn’t say. To me, this is an enormous red flag that we’re probably dealing with a charlatan here. Is he hoping that desperate people will pay him to learn what these apparently amazing techniques are?

His advice for ensuring a project’s requirements are “adequate” is similarly unhelpful. As near as I can figure it, reading his article, his solution amounts to making sure that you know what the “REAL” [emphasis Goldsmith] requirements are at the start of the project, so you don’t waste time on the requirements that aren’t “REAL”.

Is “REAL” an acronym for something illuminating? No. Goldsmith says he’s capitalizing to avoid the possibility of page formatting errors. He defines it as “real and right” and “needed” and “the requirements we end up with” and “business requirements.” Apparently, then, “most” projects that have difficulties are focusing on “product requirements” instead of “business requirements.”

Let’s say that again: To ensure your requirements are adequate you must ensure you’ve defined the right requirements.

I see stuff like this and start to wonder if perhaps my reading comprehension has taken a sudden nose dive.