base rate fallacy


i was tempted to post this in the Riddles forum to get more participation, but strictly speaking it's a type of math problem. i came across it in a book i read recently, Fooled by Randomness. here's a good example; please try to solve it without googling.

Quote:

"A cab was involved in a hit and run accident at night. Two cab companies, the Green and the Blue, operate in the city. 85% of the cabs in the city are Green and 15% are Blue.

A witness identified the cab as Blue. The court tested the reliability of the witness under the same circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colors 80% of the time and failed 20% of the time.

What is the probability that the cab involved in the accident was Blue rather than Green?"

here's a clue: the answer is not 80%. it would suffice to explain why it's not 80% rather than calculate the correct probability, although that's not too hard to do.

i also invite you to post any other common miscalculations of probability you can think of.


damn i sux at math...

id say like either 60% or 90%.

just guessing though.

0 Replies

60's a better guess, but not good enuf :wink:

0 Replies

0.85 × 0.80 = 0.68, so 68%?

0 Replies

nice try but not close. :wink:

what you need to calculate are

1) how many cars get identified as blue, remembering that 20% of green cars and 80% of blue cars get identified as blue
2) how many of those are actually blue, which is 1) - 20% of green cars
3) divide 2) by 1)
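
For anyone who wants to check their own attempt, the three steps above can be sketched in a few lines of Python. This is only an illustration: the function name and parameter names are made up for this example, and the defaults are the figures from the quoted problem (15% Blue cabs, 80% witness accuracy for either color).

```python
def share_of_blue_calls_that_are_blue(share_blue=0.15, accuracy=0.80):
    """Follow the three steps: count 'blue' calls, count the real blues among them, divide."""
    share_green = 1.0 - share_blue
    green_called_blue = (1.0 - accuracy) * share_green  # green cabs misidentified as blue
    blue_called_blue = accuracy * share_blue            # blue cabs correctly identified

    # Step 1: everything that gets identified as blue
    identified_as_blue = blue_called_blue + green_called_blue
    # Step 2: of those, the ones that really are blue (step 1 minus the misidentified greens)
    actually_blue = identified_as_blue - green_called_blue
    # Step 3: divide 2) by 1)
    return actually_blue / identified_as_blue


print(share_of_blue_calls_that_are_blue())            # the 85/15 city from the problem
print(share_of_blue_calls_that_are_blue(0.50, 0.80))  # a 50/50 city, for comparison
```

With a 50/50 split the result comes back to the witness's own accuracy, which is why the naive answer only works when neither color dominates.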

1 Reply

@yitwail,

Nice to see you, Yitwail!

I'm not convinced that that's what "you need to calculate". The percentage of green and blue cars in the street was the same when the witness gave her testimony as it was when the court tested her reliability. So the effect of this distribution should cancel out, and the answer is 80%, as you would naively guess.

Why is this argument wrong?

1 Reply

@Thomas,

good to see you too, Thomas.

your argument only applies when the numbers of green & blue cars are roughly equal. because one outnumbers the other substantially, the mistakes in identifying the common colored cars become significant compared to the overall number of uncommon colored cars.

1 Reply

@yitwail,

yitwail wrote:

because one outnumbers the other substantially, the mistakes in identifying the common colored cars become significant compared to the overall number of uncommon colored cars.

I understand that. But you said the court tested the witness's reliability under the same conditions as those to which the witness testified. So the mistake is more significant in the real experiment (the testimony) than in a 50:50 distribution. I get that. But it's also more significant, and by the same margin, in the control experiment (the court's evaluation of the witness). Given that the distribution of mistakes is skewed in identical ways in both the experiment and the control experiment, the percentage of correct identifications should be the same in both, too.

2 Replies

@Thomas,

This is why the sharks make so damn much money...

0 Replies

@Thomas,

Thomas wrote:

I understand that. But you said the court tested the witness's reliability under the same conditions as those to which the witness testified. So the mistake is more significant in the real experiment (the testimony) than in a 50:50 distribution. I get that. But it's also more significant, and by the same margin, in the control experiment (the court's evaluation of the witness). Given that the distribution of mistakes is skewed in identical ways in both the experiment and the control experiment, the percentage of correct identifications should be the same in both, too.

how's this, then: the court observed the witness as 100 taxis drove by, with an 85-15 distribution, and the witness identified each color with 80% accuracy. therefore, 80% of 85 = 68 & 80% of 15 = 12 cars were identified correctly, for a total of 68 + 12 = 80 cars as expected. now, if the witness were to see 15 blue cars in a row, 12 would be identified as blue, but if he sees 85 green cars in a row, then 85 - 68 = 17 would also be (mis)identified as blue. the 80% accuracy still holds, but because one color is so dominant over the other, the misidentification of the dominant color outweighs the correct identification of the infrequent color.

1 Reply

Didn't we discuss this on another thread?

For one car the witness' accuracy is 80%. The inaccuracy only builds up if you are counting many independent events.

0 Replies

we did discuss it elsewhere, but i'm not motivated enough to find a link to it at the moment.

however, isn't the accuracy for one car either 100% or 0%, and to get any other percentage, one has to find an average?

1 Reply

@yitwail,

Oh dear, are we going to be pedantic?

For one cab the probability of being correct is 80%.

This may or may not be accurate enough for a court of law.

0 Replies

@yitwail,

Ah! So there is a difference between the real experiment and the control experiment! The difference is that in the control experiment, the witness sees a distribution of 85% green cars, 15% blue cars, which he identifies with error rate e. In the real experiment, on the other hand, the witness sees a distribution of 100% blue cars, 0% green cars, which he identifies with e'. Because the distribution of cars to be identified differs, the error rates differ.

Better?

0 Replies

This is a classic example of Bayes' Law ( see http://en.wikipedia.org/wiki/Bayes'_theorem ).

$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$

P(A) is the chance that the car is blue (which is 0.15).
P(B) is the chance that the witness will identify the car as blue (which is 0.29; see below for the calculation).
P(A|B) is the probability that the car is blue given that the witness identified the car as blue (this is what we want to find).
P(B|A) is the probability the car will be identified as blue, given that the car is blue (which is 0.80, the chance of being correct).

P(B) is .15*.80 + .85*.20, or .29.

So the answer I get is (.80 * .15)/.29, or about 41%.
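
The same arithmetic can be checked exactly with Python's `fractions` module. This is just a sketch of the calculation above; the variable names are chosen for this example and mirror P(A), P(B|A), and P(B).

```python
from fractions import Fraction

p_a = Fraction(15, 100)              # P(A): the cab is blue
p_b_given_a = Fraction(80, 100)      # P(B|A): witness says blue when the cab is blue
p_b_given_not_a = Fraction(20, 100)  # witness says blue when the cab is green

# Total probability that the witness says "blue"
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b)                 # 29/100
print(p_a_given_b)         # 12/29
print(float(p_a_given_b))  # about 0.41
```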

2 Replies

@ebrown p,

Now I really understand it. Thanks ebrown!

1 Reply

@Thomas,

I thought the wikipedia article (as usual) had a pretty good explanation with illustrative examples.

The drug testing example in this article is interesting... when the number of drug users in a population is small, the number of false positives will dwarf the number of true positives.
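
To see the effect in numbers, here is a rough sketch that counts true and false positives as the share of users shrinks. The figures are hypothetical and picked only for illustration (a population of 100,000, a test that catches 99% of users and wrongly flags 1% of non-users); they are not taken from the article.

```python
def positive_counts(population, base_rate, sensitivity, false_positive_rate):
    """Expected true and false positives for a test applied to a whole population."""
    users = population * base_rate
    non_users = population - users
    true_positives = users * sensitivity
    false_positives = non_users * false_positive_rate
    return true_positives, false_positives


# Hypothetical test: 99% sensitivity, 1% false positive rate
for base_rate in (0.20, 0.05, 0.01, 0.001):
    tp, fp = positive_counts(100_000, base_rate, 0.99, 0.01)
    print(f"base rate {base_rate:>6.1%}: true positives {tp:8.0f}, false positives {fp:8.0f}")
```

Under these assumed numbers the two counts are about equal at a 1% base rate, and at 0.1% the false positives outnumber the true ones roughly ten to one.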

1 Reply

Let's try another explanation. Imagine there are 1000 cabs... each one is seen one time.

By the initial figures there will be 850 green cabs, and 150 blue cabs.

Of the 850 green cabs, 680 (i.e. 80%) will be identified as green, and 170 will be identified as blue.

Of the 150 blue cabs, 30 (20%) will be identified as green, and 120 will be identified as blue.

So 170 green cabs will be identified as blue.
... and 120 blue cabs will be identified as blue.

So of the 290 cabs identified as blue, 170 will really be green and 120 will really be blue.

So the probability that a cab identified as blue really is blue is 120/290, or about 41%.

Note again, the interesting part is that since there are far more green cabs than blue... the number of false reports of blue cabs is greater than the number of true blue cabs.
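
If the counting argument still feels suspicious, a quick simulation is another way to test it. This is a minimal sketch under the problem's stated assumptions (15% blue cabs, a witness who is right 80% of the time regardless of color); the function name, the number of sightings, and the fixed seed are arbitrary choices for this example.

```python
import random


def simulate(n_sightings=200_000, share_blue=0.15, accuracy=0.80, seed=0):
    """Estimate how often a cab called 'blue' by the witness really is blue."""
    rng = random.Random(seed)
    said_blue = 0
    said_blue_and_is_blue = 0
    for _ in range(n_sightings):
        is_blue = rng.random() < share_blue
        correct = rng.random() < accuracy
        # A correct witness reports the true color; an incorrect one reports the other color
        says_blue = is_blue if correct else not is_blue
        if says_blue:
            said_blue += 1
            if is_blue:
                said_blue_and_is_blue += 1
    return said_blue_and_is_blue / said_blue


print(simulate())  # should land close to 12/29, i.e. a bit over 0.41
```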

0 Replies

@ebrown p,

eBrown wrote:

The drug testing example in this article is interesting... when the number of drug users in a population is small, the number of false positives will dwarf the number of true positives.

Another interesting application is when witnesses to a crime testify to the race of the criminal. Could a significant number of minority prison inmates have been wrongly convicted by getting on the wrong side, not of federal or state law, but of Bayes's Law?

0 Replies

@ebrown p,

couldn't have said it better myself ;-) thanks ebp

0 Replies
