The Black Swan

I hate the idea of writing a book review for a post. Somehow it strikes me as cheap and lazy to rely so heavily on the work of others for content, particularly when my blog is so new. Shouldn’t I be concerned with sharing my own thoughts instead of parroting the thoughts of others?

Even worse: choosing Nassim Taleb’s The Black Swan (second edition, just released a few weeks ago) as the review’s subject matter. Taleb has a notorious disdain for reviewers, many of whom seem to either miss his message entirely* or distort it in some consequential fashion. Given the book’s Kolmogorov complexity, any attempt at encapsulation is bound to leave out something significant (in contrast to the easily summarizable journalistic “idea book of the week” that excites the MBAs and is the intellectual equivalent of fast food. Anyone remember Who Moved My Cheese?).

I think the book’s message is important, and Taleb, being a champion of the skeptical empiricist, says a great deal that should excite and inspire the software tester. So, I’m willing to risk appearing lazy, but let’s not call this post a review so much as a somewhat desultory sampler. The Black Swan is a philosophical essay that is both dense and broad, and explores many interesting ideas–irreverently, I might add. My aim here will be to stick to those ideas that pertain to testing. I’ll leave the rest for you to discover on your own if you should decide to pick up a copy of the book for yourself.

The Black Swan

“All swans are white.”

Before 1697, you could say this, and every sighting of another swan would add firmness to your conviction of its “truth”. But then Europeans discovered a black swan in Western Australia. A metaphor for the problem of induction was born.

Taleb’s Black Swan (note the capitalization) is distinct from the philosophical issue, however. I’ll let Taleb define it:

First, it is an outlier, as it lies outside the realm of regular expectations, because nothing in the past can convincingly point to its possibility. Second, it carries an extreme impact (unlike the bird). Third, in spite of its outlier status, human nature makes us concoct explanations for its occurrence after the fact, making it explainable and predictable.

I stop and summarize the triplet: rarity, extreme impact, and retrospective (though not prospective) predictability. A small number of Black Swans explain almost everything in our world, from the success of ideas and religions, to the dynamics of historical events, to elements of our own personal lives. [emphasis original]

I’m confident that you can already see where this applies in the world of software. A Black Swan would be any serious bug that made it into a released product and caused some sort of harm–either to customers or the company’s reputation (or both!).

Toyota’s recent brake system problems are a perfect example. Clearly they didn’t see this coming, and it’s cost them an estimated $2 billion. You can bet they’re trying to figure out why they didn’t catch the problem earlier–and why they should have–and how to prevent similar problems in the future.

And there’s the rub! The problem with Black Swans is that they are unpredictable by nature. Reality has “epistemic opacity”, says Taleb, owing to various inherent limitations to our knowledge, coupled with how we often deal erroneously with the information we do have. Toyota might spend billions ensuring that their cars will never have brake problems of any kind ever again, only, perhaps, to find one day that, in certain rare situations, their fuel system catches fire. It happens precisely because it’s not planned for.

So, what can we, as testers, do about the Black Swans we might face? The Black Swan counsels primarily how not to deal with them, and Taleb openly laments the typical reaction to his “negative advice.”

…[R]ecommendations of the style “Do not do” are more robust empirically [see “Negative Empiricism,” below]. How do you live long? By avoiding death. Yet people do not realize that success consists mainly in avoiding losses, not in trying to derive profits.

Positive advice is usually the province of the charlatan [see “Narrative Fallacy,” below]. Bookstores are full of books on how someone became successful [see “Silent Evidence,” below]; there are almost no books with the title What I Learned Going Bust, or Ten Mistakes to Avoid in Life.

Linked to this need for positive advice is the preference we have to do something rather than nothing, even in cases when doing something is harmful. [emphasis original]

I’m reminded of a consulting gig where I explained to the test team’s managers that their method for tracking productivity was invoking Goodhart’s Law and was thus worse than meaningless, since it encouraged counterproductive behavior in the team. The managers agreed with my analysis, but did not change their methodology. After all, they said, they were required to report something to the suits above them. They didn’t seem to have an ethical problem with tracking numbers that they knew were bullshit.

Platonicity

The ancient Greek philosopher Plato had a theory that abstract ideas or “Forms,” such as the idea of the color red, were the highest kind of reality. He believed that Forms were the only means to genuine knowledge. The error of Platonicity, then, as defined by Taleb, is

…our tendency to mistake the map for the territory, to focus on pure and well-defined “forms,” whether objects, like triangles, or social notions, like utopias (societies built according to some blueprint of what “makes sense”), even nationalities. When these ideas and crisp constructs inhabit our minds, we privilege them over other less elegant objects, those with messier and less tractable structures…

Platonicity is what makes us think that we understand more than we actually do. But this does not happen everywhere. I am not saying that Platonic forms don’t exist. Models and constructions, these intellectual maps of reality, are not always wrong; they are wrong only in some specific applications. The difficulty is that a) you do not know beforehand (only after the fact) where the map will be wrong, and b) the mistakes can lead to severe consequences. These models are like potentially helpful medicines that carry random but very severe side effects.

The error of platonification has a lot in common with the error of reification, but there is a subtle difference. Platonification doesn’t require that you believe your model is real (as in, “concrete”), only that it is accurate.

Again I’m sure you’re already thinking of ways this applies in software testing. You build a model of a system you’re testing. Soon you forget that you’re using a model and become blind to scenarios that might occur outside of it. Even worse, you write a few hundred test cases based on your model and convince yourself that, once you’ve gone through them all, you’ve “finished testing.”

Negative Empiricism

I mentioned above that The Black Swan is almost entirely advice about what not to do. However, in the chapter he devotes to confirmation bias and its brethren, Taleb introduces the heuristic of “falsification.” I hope you’ll forgive my quoting rather liberally from the section, here. He seems, for a moment, to be speaking directly to software testers:

By a mental mechanism I call naïve empiricism, we have a natural tendency to look for instances that confirm our story and our vision of the world – these instances are always easy to find. Alas, with tools, and fools, anything can be easy to find. You take past instances that corroborate your theories and you treat them as evidence. For instance, a diplomat will show you his “accomplishments,” not what he failed to do. Mathematicians will try to convince you that their science is useful to society by pointing out instances where it proved helpful, not those where it was a waste of time, or, worse, those numerous mathematical applications that inflicted a severe cost on society owing to the highly unempirical nature of elegant mathematical theories.

The good news is that there is a way around this naïve empiricism. I am saying that a series of corroborative facts is not necessarily evidence. Seeing white swans does not confirm the nonexistence of black swans. There is an exception, however: I know what statement is wrong, but not necessarily what statement is correct. If I see a black swan I can certify that all swans are not white!

This asymmetry is immensely practical. It tells us that we do not have to be complete skeptics, just semiskeptics. The subtlety of real life over the books is that, in your decision making, you need to be interested only in one side of the story: if you seek certainty about whether the patient has cancer, not certainty about whether he is healthy, then you might be satisfied with negative inference, since it will supply you the certainty you seek. So we can learn a lot from data – but not as much as we expect. Sometimes a lot of data can be meaningless; at other times one single piece of information can be very meaningful. It is true that a thousand days cannot prove you right, but one day can prove you to be wrong.

The person who is credited with the promotion of this idea of one-sided semiskepticism is Sir Doktor Professor Karl Raimund Popper, who may be the only philosopher of science who is actually read and discussed by actors in the real world (though not as enthusiastically by professional philosophers)… He writes to us, not to other philosophers. “We” are the empirical decision makers who hold that uncertainty is our discipline, and that understanding how to act under conditions of incomplete information is the highest and most urgent human pursuit. [emphasis original]

It always rankles when I hear someone (who is – usually – not a tester) declare something like “We need to prove the program works.” Obviously anyone who says this has a fundamental misconception of what is actually possible. And how many times has a programmer come to you claiming that he tested his code and “the feature works” – but you discover after only a couple of tests that his “tests” were within only a narrow range, outside of which the feature breaks immediately?
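
To make that asymmetry concrete, here is a minimal Python sketch. Everything in it is hypothetical: `percent_of` stands in for some feature a programmer claims to have tested, `find_counterexample` is a helper invented for this illustration, and the inputs are made up. The point is only that a handful of passing checks drawn from a narrow range confirms very little, while a single failing input settles the question.

```python
def percent_of(part, whole):
    """Hypothetical feature under test: what percentage of `whole` is `part`?"""
    return 100 * part / whole


def find_counterexample(func, cases):
    """Return the first (args, error) pair for which func raises, or None if no case fails."""
    for args in cases:
        try:
            func(*args)
        except Exception as error:
            return args, error
    return None


# The programmer's "tests": all drawn from the same comfortable range.
narrow_cases = [(1, 2), (5, 10), (30, 40)]

# A tester's probes: the same range plus inputs the original model never considered.
wider_cases = narrow_cases + [(0, 5), (5, 0)]

confirmations = find_counterexample(percent_of, narrow_cases)
refutation = find_counterexample(percent_of, wider_cases)

print(confirmations)  # None: three white swans, which prove nothing
print(refutation)     # the one input that raised, paired with its ZeroDivisionError
```

Note that the helper only ever reports a failure. Even when it returns None, the honest conclusion is “no counterexample found yet,” never “the feature works.”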

All The Rest

I’ve only touched on a very small part of the contents of The Black Swan, but hopefully enough to convince you that it’s required reading for software testers. I’ll close the post with short descriptions of a few of the bigger ideas in the book that I skipped:

  • Mediocristan – A metaphorical country where deviations from the median are small and relatively rare, and those deviations can’t meaningfully affect the total. Think heights and weights of people. Black Swans aren’t possible here.
  • Extremistan – A metaphorical country where Black Swans are possible, because single members of a population can affect the aggregate. Think income or book sales.
  • Ludic Fallacy – Roughly speaking, the belief that you’re dealing with a phenomenon from Mediocristan when it’s actually from Extremistan. The Ludic Fallacy is a special case of the Platonic Fallacy.
  • Narrative Fallacy – The tendency to believe or concoct explanations that fit a complicated set of historical facts because they sound plausible. Conspiracy theories are only a small facet of this. These narratives cause us to think that past events were more predictable than they actually were. We become, as Taleb puts it, “Fooled by Randomness.”
  • Silent Evidence – That part of a population that is ignored because it is “silent,” meaning either difficult or impossible to see. We see all the risk-takers who succeeded in business, but not all risk-takers who failed. The result is the logical error called survivorship bias.
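
The difference between Mediocristan and Extremistan in the list above comes down to simple arithmetic, so a rough Python sketch may help. The populations and figures here are invented for illustration, and `share_of_total` is a hypothetical helper, not anything from the book. It asks how much of the aggregate a single extreme newcomer accounts for.

```python
def share_of_total(values, newcomer):
    """Fraction of the new total contributed by one added member."""
    return newcomer / (sum(values) + newcomer)


# Invented numbers, chosen only to illustrate the contrast.
heights_cm = [170] * 1000           # a thousand people of ordinary height
net_worth_usd = [100_000] * 1000    # a thousand people of ordinary wealth

# Mediocristan: add an extraordinarily tall person (250 cm).
tallest = share_of_total(heights_cm, 250)

# Extremistan: add a multibillionaire (50 billion dollars).
richest = share_of_total(net_worth_usd, 50_000_000_000)

print(f"Tallest person's share of total height: {tallest:.3%}")
print(f"Richest person's share of total wealth: {richest:.3%}")
```

With these made-up figures the tall newcomer accounts for well under 1% of the total height, while the wealthy newcomer accounts for more than 99% of the total wealth. One observation barely moves the first aggregate and completely dominates the second.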

*An example of this is found in the quote from GQ magazine that appears, ironically, on the front cover of the book itself: “The most prophetic voice of all.” Taleb’s point is to be wary of anyone who claims he can predict the future. He says of himself, “I know I cannot forecast.”

  1. Twitted by lippard - pingback on June 21, 2010 at 6:47 am
  2. I wrote an article for Better Software on testing and The Black Swan a while ago. If I had amplified on it, this blog post looks a lot like what I would have written. Excellent.

    Apropos of the problem of people who are tracking numbers that they know to be bullshit: “There are people who produce forecasts uncritically. When asked why they forecast, they answer ‘Well, that’s what we’re paid to do here.’ My suggestion: get another job.” (p. 163 in the new, paperback edition). I agree.

    Great stuff, man!

    —Michael B.

  3. Thanks, Michael! Much appreciated.

    Excellent quote from the book, too. I wish I’d thought of including it.

  4. Thank you for this post, Abe (and Michael for alerting Twitterville of Abe’s blog). After this, I’ll be sure to get The Black Swan on my book list. Most of these ideas are ones which I understood implicitly, but struggled to express explicitly to my project managers, who sometimes feel a worthy contribution to the testing team is to point out, after a Black Swan is discovered in production, that “we really should test for that sort of bug.” It’s always nice to have some backup for my thinking and new ways to express it to others.

    A tangential story off of the negative empiricism discussion:
    I work in a very small development environment on an enterprise application, and when I first started a couple years ago (my first experience in testing… yes, I’m quite green), the developers wrote all of the test cases. We’re finally growing out of that as our test team expands, but I’ve since discovered that the developers’ test cases actually have their own value: they tell me in what context the developer has coded a feature to work. With that starting point, I have an easier time spiraling out and finding the contexts in which the code will fail; exploratory testing off of a developer-written test case can feel like a test dialogue. For example, if the script says “enter data in the milestone field and verify that the output also appears in the schedule area,” I can ask questions like “Why did the developer only specify that output area? Did he only design it to work for that particular input? What if I input in the schedule area?” Occasionally, the developer’s script immediately reveals to me flaws in the code design.

  5. Jeremy, thank you for the comment. Your point about developers being a great source of testing ideas is well taken!

    It reminds me of something a former boss was fond of saying (of course it wasn’t original to him): “Trust, but verify!”

  6. Testing’s Quiet Evidence | Abe Heward's Blog - pingback on July 5, 2010 at 6:51 am
  7. Another great reference to this book. Despite just receiving another arm load of reading material I will need to pick this one up. Ack!

    Thanks for the review.


  8. It really is required reading for testers!

    Thanks, Lynne!
