percentile calculation

Hello

Does anyone have a knowledge of statistics that can help me with a problem I have? I am currently performing some testing at work, and I have estimates of what the means and standard deviations of the data I am collecting should be. I want to collect the data straight into Microsoft Access and have the entry form calculate the percentile for each value as it is entered. Does anyone know how the percentile for a value can be calculated from a known mean and standard deviation? I want to do this as a check on the data being collected, to help prevent measurement errors.

Any help on how to do this would be greatly appreciated.

Mark Edwards

My knowledge of statistics is very ropey, Mark, but if someone can post the answer re: percentiles then I can probably help with the Access entry form calculations.

0 Replies

First of all, welcome to A2K. I hope you get a chance to look around. There are lots of good, interesting people here.

You need to first calculate the z-score of a data point. The z-score is calculated easily by the equation:

z = (data_value - mean)/std_deviation

(This rescales your data so that the mean is 0 and "z" is the number of standard deviations a value lies from the mean.)

Then look the probability up on a table. A fine one is at http://techniques.geog.ox.ac.uk/mod_2/tables/z-score.htm

The percentile given on this table is the probability that any random data point will be less than the given value. The table lists positive z values only. For negative z values you can take advantage of the fact that the normal distribution is symmetrical: look up p for the positive value of z, and the percentile you want is just p'=1-p

(Unfortunately, calculating the "area under the normal distribution" directly is quite difficult, because the integral has no closed-form solution. Believe me, the table is the best way to do this.)

For example, if your mean is 10 and your standard deviation is 2, let's calculate the percentile for the value 13. The z value will be (13-10)/2 or 1.5. From the table, 1.5 gives .9332, so there is a 93.32% chance that a random data point will be less than 13.

A data value of 7 would have a negative z-value (i.e. -1.5). We can use 1-.9332 so the percentile would be .0668.
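
If looking values up in a table is awkward to automate, the same numbers can be obtained from the error function, which many languages provide as a built-in. The following is only a rough sketch in Python (not Access), and the function names `z_score` and `normal_percentile` are made up for illustration. It assumes the data is normally distributed and uses the identity that the cumulative normal probability equals 0.5 * (1 + erf(z / sqrt(2))).

```python
import math


def z_score(value, mean, std_dev):
    """Number of standard deviations that `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (value - mean) / std_dev


def normal_percentile(value, mean, std_dev):
    """Probability that a normally distributed data point is below `value`.

    Hypothetical helper: it replaces the table lookup with the error
    function, and it handles negative z values without the 1 - p step.
    """
    z = z_score(value, mean, std_dev)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# The worked example from this post: mean 10, standard deviation 2.
print(round(normal_percentile(13, 10, 2), 4))  # about 0.9332
print(round(normal_percentile(7, 10, 2), 4))   # about 0.0668
```

The two printed values should agree with the table figures used in the example above.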

One caveat:

This only works for data that follows a normal distribution. This is true for many types of data, but not all. You need to make sure you understand the data and be confident it follows a normal distribution.

Good luck.

0 Replies

Hi

Thanks for the welcome. The method you describe sounds to be exactly what I want; however, I don't know how to implement looking up the values in the table in Access. Is this possible? I was hoping to do it so the percentiles are calculated as the data is entered, so measurements can be retaken immediately if they are extreme.

Many thanks for the help

Mark

0 Replies

Alternate Solutions
Excel has a formula that returns the cumulative normal distribution of the form:

NORMSDIST(z)

where z is the Z score calculated as shown in a previous post. If Excel has it, Access probably does also.

I don't recommend using this as a sanity check. You don't want your data entry being skewed by a user trying to decide if a value of 10% is reasonable (it is). Another technique you could use is to base your sanity limits on whether or not a value is between (mean - 2 sigma) and (mean + 2 sigma). This is really easy to code since you know those values already.
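
To show how little code the two-sigma limit check needs, here is a minimal sketch in Python (not Access). The function name `within_sigma_limits` and the `n_sigma` parameter are assumptions for illustration, and values exactly on a limit are treated as acceptable.

```python
def within_sigma_limits(value, mean, std_dev, n_sigma=2.0):
    """Return True if `value` lies between mean - n_sigma*std_dev
    and mean + n_sigma*std_dev (limits included)."""
    lower = mean - n_sigma * std_dev
    upper = mean + n_sigma * std_dev
    return lower <= value <= upper


# With mean 10 and standard deviation 2 the limits are 6 and 14.
for measurement in (13, 14, 15.5, 5):
    status = "ok" if within_sigma_limits(measurement, 10, 2) else "re-measure"
    print(measurement, status)
```

Bear in mind that for normally distributed data roughly 5% of perfectly good values fall outside two-sigma limits, so a flag is a prompt to re-measure, not proof of an error.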

I can see the value in providing a sanity check, but you are at the top of a slippery slope. If you are conducting experimentation or monitoring process performance, then your screen pre-supposes a certain result. That's not the best way to go into experimentation. Another risk you are taking is assuming the distribution is normal. If there is a variable you are not controlling, it could make your distribution non-normal. You want to check for that since it is very informative in understanding process and experimental results.

0 Replies

Mark - A few questions to assist with the Access details:

Are there multiple data-sets, each with an 'expected' mean and SD, or just the one?

When you start entering data, won't the mean and SD change rapidly for a while until it settles down to the true values? If so, how many values are needed before you can accept the mean and SD?

Do you want to recalculate the mean and SD after every new value in the dataset has been entered, then check these values against the table to see whether it is possibly erroneous?

Depending on your answers, I have a feeling it may be easier, quicker and more accurate to input the data into Excel, and use "live" formulae to check your mean and SD values with a Vlookup table. Once you are happy with the dataset, you can then import it into Access for reporting etc.

0 Replies

Hi

The measures I'm taking are all anthropometric measurements. The mean and standard deviation I want to use are from a previous anthropometric survey in the UK of thousands of people, while I'm only measuring about 100. I want to calculate the percentile of each person I measure in relation to this much larger study. As you have said, this sounds like something Access may not be able to do easily, so I reckon I may go with the Excel route. Many thanks for all your help.

Mark

0 Replies
