# Statistics rookie needs help with correlation

Wed 10 Jul, 2013 05:13 am
Hi,

I have the following problem, and it is just the start of my analysis journey:

I have two groups which I'm comparing:
Positive and negative for a virus (pos 62, neg 35)
I'm comparing the diagnoses of these patients. There are 2 possibilities: Diagnosis A or Diagnosis B
So for the virus pos (62) I get the following ratio A/B: 23/39
For virus neg (35) I get the ratio A/B: 19/16

This means the A/B percentages are 37%/63% for pos and 54%/46% for neg.

Diagnosis A/Diagnosis B in all cases (pos and neg (97)) is A=42 and B=55
Positive/negative in all cases (all diagnosis (97)) pos=62 and neg=35

And now I'm stuck. What type of analysis can I do next to see if there is any real difference? Please give me some hints...
Lordyaswas

1
Wed 10 Jul, 2013 06:24 am
@Rine1012,
I haven't a clue, but this doesn't appear to be spam so I'll give it a bump.

Maybe a satitsitc... a stastic.... a numbers person will see this.

BUMP!
JPB

1
Wed 10 Jul, 2013 06:43 am
@Rine1012,
First, what's your hypothesis? Once you've got a hypothesis you'll want to do a Chi-square analysis.

Set your data up in a 2-by-2 table.

This Link explains it pretty well.

You can report out the results of the significance of the chi-square based on the degrees of freedom for your data (see link) or you can express it in terms of an odds ratio. There are online calculators for both chi-square and odds ratio calculations. Here's one for an odds ratio http://www.vassarstats.net/odds2x2.html
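
For anyone who would rather see the arithmetic than use a calculator, here is a minimal sketch in plain Python of the chi-square and odds ratio for the 2-by-2 table from the original post. The function names are made up for illustration, and it assumes the plain Pearson chi-square without a continuity correction, so an online calculator that applies one may report a somewhat smaller chi-square value.

```python
import math

# Observed counts from the original post.
# Rows: virus status. Columns: Diagnosis A, Diagnosis B.
observed = [
    [23, 39],  # virus positive
    [19, 16],  # virus negative
]


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if virus status and diagnosis were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom; for 1 df the upper-tail
    # p-value of the chi-square distribution is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_2x2(table):
    """Odds of Diagnosis A (vs. B) in row 1 divided by the same odds in row 2."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


chi2, p_value = chi_square_2x2(observed)
odds_ratio = odds_ratio_2x2(observed)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}, odds ratio = {odds_ratio:.2f}")
```

With these counts the chi-square comes out around 2.7 with a p-value of roughly 0.10, which is above the usual 0.05 cutoff, and the odds ratio is about 0.5.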
engineer

1
Wed 10 Jul, 2013 07:00 am
@Rine1012,
What are you trying to understand? Are you asking whether Diagnosis A or B is better in predicting the presence of the virus?

The best way to do this is to stick it into a statistical package and generate a model. It takes five minutes, and the answer is that there is no statistically significant difference between diagnoses A and B at the 95% confidence level. If you want to do it manually, you should seek medical help. Ok, that was a bad pun. You can create a model that looks something like:

Virus = m * Diagnosis + b

Where Virus = 1 or 0 for yes or no, Diagnosis = 1 or 0 for A or B, and m and b are the usual slope and intercept for fitting a line. You can fit the line in Excel or by hand to find the slope and intercept, but what is important is the F statistic on the slope. That will show you significance. The question you need to answer is "Is this line any different than just a flat line?" because a flat line means no significance at all. You can then use an online app or a statistics table to see whether that value is significant. In this case, it is not.
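
As a rough sketch of the line fit described above, the following plain Python expands the counts from the original post into one (Diagnosis, Virus) pair per patient, fits Virus = m * Diagnosis + b by ordinary least squares, and computes the F statistic for the slope. The variable names are made up for illustration, and a statistical package would report the same quantities along with a p-value.

```python
# Counts from the original post, keyed by (diagnosis, virus).
# Coding follows the post: diagnosis = 1 for A, 0 for B;
# virus = 1 for positive, 0 for negative.
counts = {(1, 1): 23, (0, 1): 39, (1, 0): 19, (0, 0): 16}

# Expand the counts into one observation per patient.
x, y = [], []
for (diagnosis, virus), count in counts.items():
    x.extend([diagnosis] * count)
    y.extend([virus] * count)

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares estimates of slope (m) and intercept (b).
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Split the total variation into the part explained by the line
# and the residual part left over.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
ss_regression = ss_total - ss_residual

# F statistic for the slope: 1 and n - 2 degrees of freedom.
f_statistic = ss_regression / (ss_residual / (n - 2))

print(f"m = {slope:.3f}, b = {intercept:.3f}, F(1, {n - 2}) = {f_statistic:.2f}")
```

The F statistic here is about 2.7 on 1 and 95 degrees of freedom. A standard F table puts the 5% critical value for those degrees of freedom a little under 4, so the slope is not significantly different from a flat line, which agrees with the chi-square result.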
engineer

1
Wed 10 Jul, 2013 07:04 am
@JPB,
That's a great link! You can enter your data directly into a table there and get the Chi-square statistic right away! It's also a lot faster than my technique, and it gives the same answer.
Lordyaswas

1
Wed 10 Jul, 2013 07:09 am
The breadth of knowledge on A2K never ceases to amaze me.

Well done, peeps.
Rine1012

1
Thu 11 Jul, 2013 06:42 am
@JPB,
Thanks, that was very helpful. I will dig through your link...
