base rate fallacy

 
 
yitwail
 
Reply Thu 7 Aug, 2008 12:09 am
i was tempted to post this in the Riddles forum to get more participation, but strictly speaking it's a type of math problem i came across in a book i read recently, <Fooled by randomness>. here's a good example: do try to solve it without googling please.

Quote:
"A cab was involved in a hit and run accident at night. Two cab companies, the Green and the Blue, operate in the city. 85% of the cabs in the city are Green and 15% are Blue.

A witness identified the cab as Blue. The court tested the reliability of the witness under the same circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colors 80% of the time and failed 20% of the time.

What is the probability that the cab involved in the accident was Blue rather than Green?"


here's a clue: the answer is not 80%. it would suffice to explain why it's not 80% rather than calculate the correct probability, although that's not too hard to do.

i also invite you to post any other common miscalculations of probability you can think of.

 
OGIONIK
 
  1  
Reply Thu 7 Aug, 2008 12:40 am
damn i sux at math...

id say like either 60% or 90%.

just guessing though.
0 Replies
 
yitwail
 
  1  
Reply Thu 7 Aug, 2008 12:48 am
60's a better guess, but not good enuf :wink:
0 Replies
 
mars90000000
 
  1  
Reply Sat 9 Aug, 2008 04:54 pm
0.85X0.80=0.68X100% = 68% ?
0 Replies
 
yitwail
 
  1  
Reply Sun 10 Aug, 2008 09:48 am
nice try but not close. :wink:

what you need to calculate are

1) how many cars get identified as blue, remembering that 20% of green cars and 80% of blue cars get identified as blue
2) how many of those are actually blue, which is 1) - 20% of green cars
3) divide 2) by 1)
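
For anyone who wants to check their arithmetic, the three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `share_truly_blue` and the choice of 85 green and 15 blue cabs (any counts in an 85:15 ratio would do) are assumptions made for the example, not part of the original puzzle.

```python
def share_truly_blue(n_green, n_blue, accuracy):
    """Follow the three steps: count 'blue' reports, keep the real blues, divide."""
    false_blue = (1 - accuracy) * n_green      # green cabs misreported as blue
    true_blue = accuracy * n_blue              # blue cabs correctly reported as blue
    called_blue = false_blue + true_blue       # step 1: everything identified as blue
    actually_blue = called_blue - false_blue   # step 2: remove the misidentified greens
    return actually_blue / called_blue         # step 3: divide 2) by 1)


# With equal fleets the naive guess holds; with an 85:15 split it does not.
print(round(share_truly_blue(50, 50, 0.80), 3))
print(round(share_truly_blue(85, 15, 0.80), 3))
```

Running it with equal fleet sizes gives back the witness's 80%, which is why the naive answer only works when neither color dominates.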
Thomas
 
  1  
Reply Thu 14 Aug, 2008 03:28 pm
@yitwail,
Nice to see you, Yitwail!

I'm not convinced that's what "you need to calculate". The percentage of green and blue cars in the street was the same when the witness gave her testimony as it was when the court tested her reliability. So the effect of this distribution should cancel out, and the answer is 80%, as you would naively guess.

Why is this argument wrong?
yitwail
 
  1  
Reply Thu 14 Aug, 2008 07:10 pm
@Thomas,
good to see you too, Thomas. :smile: your argument only applies when the number of green & blue cars are roughly equal. because one outnumbers the other substantially, the mistakes in identifying the common colored cars become significant compared to the overall number of uncommon colored cars.
Thomas
 
  1  
Reply Thu 14 Aug, 2008 07:21 pm
@yitwail,
yitwail wrote:
because one outnumbers the other substantially, the mistakes in identifying the common colored cars become significant compared to the overall number of uncommon colored cars.

I understand that. But you said the court tested the witness's reliability under the same conditions as those to which the witness testified. So the mistake is more significant in the real experiment (the testimony) than in a 50:50 distribution. I get that. But it's also more significant, and by the same margin, in the control experiment (the court's evaluation of the witness). Given that the distribution of mistakes is skewed in identical ways in both the experiment and the control experiment, the percentage of correct identifications should be the same in both, too.
Rockhead
 
  1  
Reply Thu 14 Aug, 2008 07:26 pm
@Thomas,
This is why the sharks make so damn much money...
0 Replies
 
yitwail
 
  1  
Reply Thu 14 Aug, 2008 10:42 pm
@Thomas,
Thomas wrote:
I understand that. But you said the court tested the witness's reliability under the same conditions as those to which the witness testified. So the mistake is more significant in the real experiment (the testimony) than in a 50:50 distribution. I get that. But it's also more significant, and by the same margin, in the control experiment (the court's evaluation of the witness). Given that the distribution of mistakes is skewed in identical ways in both the experiment and the control experiment, the percentage of correct identifications should be the same in both, too.


how's this, then: the court observed the witness as 100 taxis drove by, with an 85-15 distribution, and the witness identified each color with 80% accuracy. therefore, 80% of 85 = 68 & 80% of 15 = 12 cars were identified correctly, for a total of 68 + 12 = 80 cars as expected. now, if the witness were to see 15 blue cars in a row, 12 would be identified as blue, but if he sees 85 green cars in a row, then 85 - 68 = 17 would also be (mis)identified as blue. the 80% accuracy still holds, but because one color is so dominant over the other, the misidentification of the dominant color outweighs the correct identification of the infrequent color.
DrewDad
 
  1  
Reply Fri 15 Aug, 2008 12:06 am
Didn't we discuss this on another thread?

For one car the witness' accuracy is 80%. The inaccuracy only builds up if you are counting many independent events.
0 Replies
 
yitwail
 
  1  
Reply Fri 15 Aug, 2008 02:50 am
we did discuss it elsewhere, but i'm not motivated enough to find a link to it at the moment. :embarrassed: however, isn't the accuracy for one car either 100% or 0%, and to get any other percentage, one has to find an average? :confused:
DrewDad
 
  1  
Reply Fri 15 Aug, 2008 06:58 am
@yitwail,
Oh dear, are we going to be pedantic?

For one cab the probability of being correct is 80%.

This may or may not be accurate enough for a court of law. :rolleyes:
0 Replies
 
Thomas
 
  1  
Reply Fri 15 Aug, 2008 08:15 am
@yitwail,
Ah! So there is a difference between the real experiment and the control experiment! The difference is that in the control experiment, the witness sees a distribution of 85% green cars, 15% blue cars, which he identifies with error rate e. In the real experiment, on the other hand, the witness sees a distribution of 100% blue cars, 0% green cars, which he identifies with e'. Because the distribution of cars to be identified differs, the error rates differ.

Better?
0 Replies
 
ebrown p
 
  1  
Reply Fri 15 Aug, 2008 02:02 pm
This is a classic example of Bayes' Law (see http://en.wikipedia.org/wiki/Bayes'_theorem ).

P(A|B) = P(B|A) * P(A) / P(B)

P(A) is the chance that the car is blue (which is 0.15).
P(B) is the chance that the witness will identify the car as blue (which is 0.29; see below for the calculation).
P(A|B) is the probability that the car is blue given that the witness identified the car as blue (this is what we want to find).
P(B|A) is the probability the car will be identified as blue, given that the car is blue (which is .80, the chance of being correct).

P(B) is .15*.80 + .85*.20, or .29

So the answer I get is (.80 * .15)/.29, or about 41%
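
Here is a minimal Python sketch of the same calculation, in case it helps to see it parameterized. The function name `posterior` and its parameter names are made up for this example; `hit_rate` plays the role of P(B|A), and `false_alarm_rate` is the chance a green cab gets called blue.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Bayes' Law: P(A|B) = P(B|A) * P(A) / P(B)."""
    # P(B): blue cabs called blue plus green cabs called blue
    evidence = prior * hit_rate + (1 - prior) * false_alarm_rate
    return prior * hit_rate / evidence


p = posterior(prior=0.15, hit_rate=0.80, false_alarm_rate=0.20)
print(f"{p:.1%}")  # 41.4%
```

Setting `prior=0.5` returns 80%, so the witness's accuracy and the answer only coincide when the two cab companies are the same size.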
Thomas
 
  1  
Reply Fri 15 Aug, 2008 02:19 pm
@ebrown p,
Now I really understand it. Thanks ebrown!
ebrown p
 
  1  
Reply Fri 15 Aug, 2008 02:32 pm
@Thomas,
I thought the wikipedia article (as usual) had a pretty good explanation with illustrative examples.

The drug testing example in this article is interesting... when the number of drug users in a population is small, the number of false positives will dwarf the number of true positives.
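
To make that concrete, here is a rough Python sketch. All the numbers in it (a population of 100,000, a test that is 95% sensitive and 95% specific, and the three prevalence values) are made up for illustration and are not taken from the article.

```python
def positive_counts(population, prevalence, sensitivity, specificity):
    """Return (true positives, false positives) for a screening test."""
    users = population * prevalence
    non_users = population - users
    true_pos = users * sensitivity             # users the test catches
    false_pos = non_users * (1 - specificity)  # non-users flagged anyway
    return true_pos, false_pos


# Hypothetical test: 95% sensitive, 95% specific, applied to 100,000 people.
for prevalence in (0.001, 0.01, 0.10):
    tp, fp = positive_counts(100_000, prevalence, 0.95, 0.95)
    print(f"prevalence {prevalence:.1%}: {tp:.0f} true vs {fp:.0f} false positives")
```

Under these assumed numbers the false positives outnumber the true positives at the two lower prevalence values, and the balance flips once users make up 10% of the population.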

ebrown p
 
  1  
Reply Fri 15 Aug, 2008 03:11 pm
Let's try another explanation. Imagine there are 1000 cabs... each one is seen one time.

By the initial figures there will be 850 green cabs, and 150 blue cabs.

Of the 850 green cabs, 680 (i.e. 80%) will be identified as green, and 170 will be identified as blue.

Of the 150 blue cabs, 30 (20%) will be identified as green, and 120 will be identified as blue.

So 170 green cabs will be identified as blue.
... and 120 blue cabs will be identified as blue.

So of the 290 cabs identified as blue, 170 will really be green and 120 will really be blue.

So the odds of a cab identified as blue really being blue are 120/290, or about 41%.

Note again, the interesting part is that since there are far more green cabs than blue... the number of false reports of blue cabs is greater than the number of true blue cabs.
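
If counting cabs still feels like a trick, a quick simulation is another way to convince yourself. This is just a sketch: the function name `simulate`, the number of trials, and the seed are arbitrary choices for the example, and it assumes the witness is right 80% of the time regardless of the cab's color, as the puzzle states.

```python
import random


def simulate(trials=200_000, p_blue=0.15, accuracy=0.80, seed=1):
    """Estimate how often a cab reported as blue really is blue."""
    rng = random.Random(seed)
    said_blue = 0
    said_blue_and_is_blue = 0
    for _ in range(trials):
        is_blue = rng.random() < p_blue      # draw a cab from the fleet
        correct = rng.random() < accuracy    # does the witness get it right?
        says_blue = is_blue if correct else not is_blue
        if says_blue:
            said_blue += 1
            said_blue_and_is_blue += is_blue
    return said_blue_and_is_blue / said_blue


print(round(simulate(), 2))
```

The estimate should land close to the 41% worked out above, with some random wobble in the later digits depending on the seed and the number of trials.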
0 Replies
 
Thomas
 
  1  
Reply Fri 15 Aug, 2008 09:44 pm
@ebrown p,
eBrown wrote:
The drug testing example in this article is interesting... when the number of drug users in a population is small, the number of false positives will dwarf the number of true positives.

Another interesting application is when witnesses to a crime testify to the race of the criminal. Could a significant number of minority prison inmates have been wrongly convicted by getting on the wrong side, not of federal or state law, but of Bayes's Law?
0 Replies
 
yitwail
 
  1  
Reply Fri 15 Aug, 2008 09:54 pm
@ebrown p,
couldn't have said it better myself ;-) thanks ebp
0 Replies
 
 
