Sample size question

I have 4 boxes, each containing 40 glass items. I am allowed to open only one box (chosen at random) and check how many glass items in that box are broken. Let us say that I find 3 broken items in that box. Based on this information, what can I say about all 160 items in the 4 boxes, and with what confidence level?

Type: Question • Score: 0 • Views: 1,089 • Replies: 6

@ddd5119,

You have a binomial distribution (broken or not broken) with a sample size of 40 and 37 successes (not broken). 37/40 = 0.925, or a 92.5% success rate. You'll need to decide what level of confidence you want around that rate before you make a judgment on the entire set of boxes. There are numerous ways to calculate the CI, but the most common one uses the normal approximation to the binomial distribution. If you want the 95% CI around 0.925 using a normal approximation, you would use the formula:
p0 +/- (z of 1-a/2) * sqrt(p0(1-p0)/n), where p0 = 0.925, a = 1.96 and n = 40

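The 95% confidence interval of 37/40 is 0.796 to 0.984. (These limits appear to come from an exact binomial calculation; plugging the numbers into the normal-approximation formula above gives roughly 0.843 to 1.007, which would be capped at 1.) This means that you are 95% confident that the proportion of unbroken items in all of the boxes is between about 80% and 98% of the total, or (160)(.8) to (160)(.98) = 128 to 157 unbroken items. You can convert that to broken items by subtraction if you'd rather take that approach.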
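
As a rough check on the interval for 37 unbroken items out of 40, the sketch below computes both the normal-approximation interval from the formula and an exact (Clopper-Pearson) interval. The function names are illustrative, not from the thread, and the exact interval is found by simple bisection on the binomial tail probabilities rather than by any particular library routine.

```python
import math

def normal_approx_ci(successes, n, z=1.96):
    """Normal-approximation interval: p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_increasing(g, lo=0.0, hi=1.0, iters=100):
    """Bisection root of a function that rises from negative to positive on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(successes, n, alpha=0.05):
    """Exact (Clopper-Pearson) interval, obtained by inverting the binomial tails."""
    # Lower limit: the p at which P(X >= successes) equals alpha/2.
    lower = 0.0 if successes == 0 else solve_increasing(
        lambda p: (1 - binom_cdf(successes - 1, n, p)) - alpha / 2)
    # Upper limit: the p at which P(X <= successes) equals alpha/2.
    upper = 1.0 if successes == n else solve_increasing(
        lambda p: alpha / 2 - binom_cdf(successes, n, p))
    return lower, upper

wald = normal_approx_ci(37, 40)   # 37 unbroken out of 40 inspected
exact = exact_ci(37, 40)
print("normal approximation:", wald)
print("exact (Clopper-Pearson):", exact)
```

With n = 40 and a rate this close to 1, the normal approximation runs past 1 at the upper end, which is one reason an exact interval may be the safer choice here.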

Without using the CI, what do you think the overall number of broken items would be?

1 Reply

@JPB,

Oops: in the formula above, a = 0.05 and (z of 1-a/2) = 1.96.

1 Reply

@JPB,

Thank you for taking the time to answer my question. Here, I have picked one box out of 4 boxes. So, what can I say about all 160 items in the 4 boxes?

I was reading a little bit more about it and came across an article that seems to indicate that this is a "cluster sampling" problem: http://tinyurl.com/298cv3g

So, each of my boxes is a cluster. Generally, in cluster sampling, you sample a few clusters. In my case, I am sampling all the items from ONE cluster alone. So, how do I apply this to my problem?

Thank you.

2 Replies

@ddd5119,

I don't agree that this is a cluster problem. You've randomly chosen one box from the four boxes available. Unless you have reason to believe that the boxes are different (clusters), you can make predictions of the breakage rate in all four boxes based on what you've observed in a single box.

The question is -- does the box you've chosen represent all boxes, or is there a reason to think they're different? If you have reason to think they're different then you have clusters. If they're not different then you have a random sample with a binomial distribution.

Is this for a class?

0 Replies

@ddd5119,

On the other hand... depending on the application (process control vs inferential statistics, for instance) you may be looking at any number of different types of problems. I wondered why you titled this a "sample size question". It really isn't about sample size... unless I'm not getting the context of the question.

1 Reply

@JPB,

The title of my question is not what I had in mind. I would simply like to compute the confidence interval for the population (4 boxes, 160 items) based on opening one box and looking at all 40 items.
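
A minimal sketch of that last step, assuming (as discussed earlier in the thread) that the four boxes are alike and taking the rounded 80% to 98% interval for unbroken items as given. The helper name `scale_to_counts` is made up for illustration.

```python
def scale_to_counts(low_rate, high_rate, total_items):
    """Convert a proportion interval into item counts, rounded to whole items."""
    return round(low_rate * total_items), round(high_rate * total_items)

TOTAL_ITEMS = 4 * 40  # 4 boxes of 40 items each

# Rounded 95% interval for the proportion of unbroken items, from the thread.
unbroken = scale_to_counts(0.80, 0.98, TOTAL_ITEMS)

# Broken items by subtraction: the high unbroken count gives the low broken count.
broken = (TOTAL_ITEMS - unbroken[1], TOTAL_ITEMS - unbroken[0])

print("unbroken items:", unbroken)
print("broken items:", broken)
```

This simply rescales the proportion interval to 160 items and subtracts to get broken items; it does not adjust for the fact that 40 of the 160 items have already been inspected.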

Thank you for your response. I will have to think a little harder about why this is a simple binomial sample.

On your question - is there a reason to believe that each box is different? I guess not.

Thank you very much for your response.

This is not a homework problem.

0 Replies
