One of the cooler, and more counter-intuitive, bits of statistics I

know of concerns the question: “If your doctor performs a 95% reliable

test on you, and it says you have a disease, how worried should you

be?” (Spoiler alert: not as much as you think.)

To start out with, you either have the disease or you don’t; and the

test either returns the correct result or it doesn’t. So that makes

four possibilities, as outlined in this illustration:

Now, one of the first things you’ll notice is that that illustration

isn’t to scale: I assumed at the beginning that the test is 95%

reliable. So the right-hand column, the one marked “Test fails”,

should be a lot narrower. Also, if you’re talking about a

life-threatening disease like colon cancer or tuberculosis, far less

than half the population has that particular disease. So the top row,

the one marked “Sick”, should be a lot skinnier as well. Let’s say that

one person in a hundred has this disease.

(Note that it’s also possible to have an asymmetric test, one that’s

95% reliable if you’re healthy, but only 50% reliable if you’re sick.

I’ll ignore this for simplicity.)

So that gives us this:

In other words, if you picked a random person off the street and

administered the test, that would be like throwing a dart at the

picture above: most would fall in the large green area (they’re not

sick, and the test says so), and only a few would fall in the yellow,

orange, or red areas.

But of course you don’t care about the health of the average person on

the street, you care about whether *you* have the disease.

Especially since the test says that you have the disease! So

let’s look at that.

If the test comes back positive, that means one of two things: either

you’re sick and the test worked (your dart fell in the orange area),

or else you’re healthy, and the test gave the wrong result (your dart

fell in the yellow area).

What are the probabilities for these cases? Well, let’s assume that

the entire diagram represents one million people. 99% (1,000,000

× 0.99 = 990,000 people) of them are healthy, and of those, 5%

will get a positive result: 1,000,000 × 0.99 × 0.05 =

49,500 people in the yellow area.

1% of the original one million people are sick, and of those, 95% will

get a positive result on the test: 1,000,000 × 0.01 × 0.95

= 9,500 people in the orange area.

That gives us a total of 49,500 + 9,500 = 59,000 people out of the

original million who got a positive result. 49,500 of them are

healthy, and 9,500 are sick. What’s the likelihood that you’re one of

the sick people? 9,500/59,000 = 0.16, or 16%, or roughly one chance

in six.
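The same head count can be reproduced in a few lines of Python. This is only a sketch using the made-up numbers from this post (one million people, 1% prevalence, a test that is 95% reliable for sick and healthy people alike); the variable names are chosen here for illustration.

```python
# Counting version of the argument, using the post's made-up numbers.
population = 1_000_000
prevalence = 0.01   # one person in a hundred is sick
accuracy = 0.95     # the test is right 95% of the time, sick or healthy

sick = round(population * prevalence)   # people who have the disease
healthy = population - sick             # everyone else

true_positives = round(sick * accuracy)            # sick, and the test works
false_positives = round(healthy * (1 - accuracy))  # healthy, and the test fails

all_positives = true_positives + false_positives
p_sick_given_positive = true_positives / all_positives

print(true_positives, false_positives, all_positives)
print(f"{p_sick_given_positive:.2f}")
```

Changing `prevalence` or `accuracy` at the top shows how quickly the answer moves.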

In the next illustration, I’ve cut up the orange and yellow areas and

put them side by side (but not resized anything) to show the contrast.

This illustrates the fact that the probability of A given B

(*p(A|B)*) is not the same as the probability of B given A

(*p(B|A)*): the probability of getting a positive result if

you’re sick is 0.95, but the probability of being sick given that

you’ve gotten a positive result is only 0.16.
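The rule that connects the two conditional probabilities is Bayes' theorem. Written out with the numbers used so far, it looks like this (the "+" for a positive result is shorthand introduced here, not notation from the illustrations):

```latex
p(\text{sick} \mid +)
  = \frac{p(+ \mid \text{sick})\, p(\text{sick})}
         {p(+ \mid \text{sick})\, p(\text{sick}) + p(+ \mid \text{healthy})\, p(\text{healthy})}
  = \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.05 \times 0.99}
  = \frac{0.0095}{0.059}
  \approx 0.16
```

The numerator is the orange area, and the denominator is the orange and yellow areas together.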

This brings up the next question: you’ve gotten a positive result on

the first test, so your doctor recommends doing a second, different

test to make sure. If the second test comes back positive as well, how

much should you worry?

First of all, notice that the conditions have changed: in the general

population, only 1% of people have the disease, but we’re not looking

at the general population anymore; we’re looking only at those people

who tested positive on the first test, and 16% of those have the

disease.

So let’s say you tested positive on both the first and second tests.

How can this happen? Recall that we narrowed the original

population of one million down to 59,000 people, 9,500 of whom are

sick, and 49,500 of whom are healthy.

Let’s say that the second test is also 95% accurate (and completely

independent of the first test). 95% of the sick people will get a

second positive result: 9,500 × 0.95 = 9,025 people. 5% of

the healthy people will also get a positive result: 49,500 ×

0.05 = 2,475 people. So given that you’ve gotten two positive results,

the probability of actually being sick is 9,025 / (9,025 + 2,475) = 0.78. At

this point, it’s definitely time to worry.
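The two-test calculation can also be done directly with probabilities, by feeding the result of the first test in as the starting point for the second. Here is a minimal Python sketch, assuming (as above) a symmetric test and two completely independent tests; the function name `update` is made up for illustration.

```python
def update(prior, accuracy):
    """Probability of being sick after one positive result (Bayes' rule).

    Assumes a symmetric test: the same accuracy whether you are sick or healthy.
    """
    true_pos = prior * accuracy                # sick, and the test works
    false_pos = (1 - prior) * (1 - accuracy)   # healthy, and the test fails
    return true_pos / (true_pos + false_pos)


after_first = update(0.01, 0.95)          # start from the 1% base rate
after_second = update(after_first, 0.95)  # the first answer becomes the new prior

print(f"after one positive:  {after_first:.2f}")
print(f"after two positives: {after_second:.2f}")
```

If the second test tends to fail on the same people as the first, the independence assumption breaks and the second positive carries less weight than this sketch suggests.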

Of course, I just made up the numbers above for purposes of

illustration. If the test is less than 95% accurate, or if fewer than

1% of people have the disease, an even larger share of the positive

results will be false positives.
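To see how sensitive the answer is to those made-up numbers, here is a small Python sketch that recomputes the chance of being sick after one positive result for a few prevalence and accuracy values. The values in the loops are arbitrary picks for illustration, and the test is again assumed to be symmetric.

```python
def p_sick_given_positive(prevalence, accuracy):
    """Chance of being sick given one positive result from a symmetric test."""
    true_pos = prevalence * accuracy
    false_pos = (1 - prevalence) * (1 - accuracy)
    return true_pos / (true_pos + false_pos)


# Arbitrary illustrative values, not data about any real disease or test.
for prevalence in (0.1, 0.01, 0.001):
    for accuracy in (0.99, 0.95, 0.90):
        p = p_sick_given_positive(prevalence, accuracy)
        print(f"prevalence {prevalence:>5.1%}  accuracy {accuracy:.0%}  "
              f"p(sick | positive) = {p:.3f}")
```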

This applies in other areas as well, such as airport security

screening.

According to the FAA,

over 762 million people flew out of US airports in 2007, or roughly 2

million people per day.

Let’s say that Al Qaeda had tried to pull another 9/11, but with ten

planes instead of the original four, and five people per plane. That

means the TSA is looking for 50 people out of 2,000,000, or 0.0025% of

passengers on that day. So even if the screening procedure is 99.9%

effective, that still means about 2,000 false alarms (and 50 captured

terrorists), so roughly 98% of all people flagged as suspect are innocent.

And that’s on that one day when there’s a massive terrorist attack.

Most days there are no hijackings, which skews the numbers even more.

So the vast majority of times when someone is flagged, it’s a false

alarm.
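The screening arithmetic can be sketched the same way. The Python below assumes roughly two million passengers in a day and a screening procedure that is 99.9% effective for attackers and innocent passengers alike; the attacker counts in the loop and the function name `flagged_breakdown` are hypothetical, chosen only for illustration.

```python
def flagged_breakdown(passengers, attackers, accuracy):
    """Return (caught attackers, false alarms, share of flags that are false alarms)."""
    caught = attackers * accuracy
    false_alarms = (passengers - attackers) * (1 - accuracy)
    flagged = caught + false_alarms
    return caught, false_alarms, false_alarms / flagged


passengers = 2_000_000   # roughly one day of US air travel
accuracy = 0.999         # assumed effectiveness of the screening

for attackers in (0, 20, 50):  # hypothetical attacker counts
    caught, false_alarms, innocent_share = flagged_breakdown(
        passengers, attackers, accuracy)
    print(f"{attackers:>2} attackers: about {false_alarms:.0f} false alarms, "
          f"{innocent_share:.1%} of flagged passengers innocent")
```

The zero-attacker row is the ordinary day: every single flag is a false alarm.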

Another example where this is extremely important is polygraph examinations. They may claim 95% accuracy (which, BTW, is a meaningless number, given that the accuracy of such a test is described by an ROC curve, and even a single operating point on that curve has two measures of accuracy), but there are a lot of situations where even that figure tells you very little. A counterintelligence screening polygraph for a security clearance is a classic example: what percentage of applicants for a security clearance actually are spies? It works out to a system where wild accusations are commonplace.

The telling part is that when they deny your application because you failed the counterintelligence poly, they don’t bother to follow up on any of your alleged spy activities. Why? While you may be infinitesimally more likely to be a security threat than somebody who didn’t fail the screening, you’re still pretty darned unlikely to be worthy of investigation. “I’ve conclusively shown that you’re a foreign agent, a drug dealer, a habitual drug user, and you’ve been involved in at least one murder. While it may seem like it would be a good idea for us to investigate these things, we won’t. Good day.” Counterintuitive indeed.