[R] T-test to check equality, unable to interpret the results.

Wed Sep 16 22:06:58 CEST 2009

I am loath to expound basic statistics here ... but, at the considerable
risk of pedantry, I must note that Steve's reply below contains fundamental
errors, which I feel should not be left on this list unremarked: t-tests do
**not** test for differences in **sample** means; they test for differences
in **population** means. The sample means are different. Period.

Furthermore, the null can be other than equality -- e.g. that the mean of
the first population is less than that of the second.
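
To illustrate how the choice of null changes the conclusion, here is a
minimal sketch in Python (standard library only). Everything in it is an
assumption made for illustration: the two samples are made up, the helper
name pooled_t is hypothetical, the statistic is the pooled equal-variance
form, and the critical values are the tabulated 0.975 and 0.95 quantiles of
the t distribution with 18 degrees of freedom.

```python
import math


def pooled_t(x, y):
    """Two-sample t statistic, pooled (equal-variance) form."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    sp2 = (ssx + ssy) / (nx + ny - 2)  # pooled variance estimate
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))


# Made-up data: 10 observations per group, so df = 10 + 10 - 2 = 18.
first = [5.1, 4.9, 5.6, 5.8, 5.3, 5.0, 5.7, 5.4, 5.2, 5.6]
second = [4.9, 5.1, 5.3, 4.8, 5.2, 5.0, 5.4, 4.7, 5.1, 5.5]

T_TWO_SIDED = 2.101  # 0.975 quantile of t with 18 df (tabulated value)
T_ONE_SIDED = 1.734  # 0.95 quantile of t with 18 df (tabulated value)

t = pooled_t(first, second)

# Null: the two population means are equal (two-sided alternative).
reject_equal = abs(t) > T_TWO_SIDED

# Null: the first population mean is <= the second (one-sided alternative).
reject_first_le_second = t > T_ONE_SIDED

print(f"t = {t:.3f}")
print("reject 'means are equal' at .05:", reject_equal)
print("reject 'first <= second' at .05:", reject_first_le_second)
```

With these particular made-up numbers the one-sided null is rejected at the
.05 level while the null of equality is not, so which null is being tested
has to be stated before the result can be interpreted.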

Finally, "statistically different" is a meaningless phrase. Rejecting when
P < .05 means that, assuming the underlying assumptions at least
"approximately hold" (and what "approximately hold" means operationally is a
technical discussion unto itself), were this calculation to be repeated over
and over again with samples of data from populations for which the null is,
in fact, true, the expected (long run) proportion of times the null will be
rejected is at most .05 (the standard frequentist interpretation). For any
**particular** pair of samples, the probability of falsely rejecting when
the null holds is either 1 or 0 -- either you rejected or not.
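
The long-run interpretation can be checked by simulation. The following is a
rough sketch in Python (standard library only), not a definitive
implementation. Its assumptions: both populations are standard normal, so
the null of equal means is true by construction; each sample has 10
observations; the statistic is the pooled equal-variance form; and 2.101 is
the tabulated two-sided .05 critical value for 18 degrees of freedom. The
names pooled_t and rejection_rate are hypothetical.

```python
import math
import random


def pooled_t(x, y):
    """Two-sample t statistic, pooled (equal-variance) form."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    sp2 = (ssx + ssy) / (nx + ny - 2)  # pooled variance estimate
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))


def rejection_rate(n_reps=20000, n=10, crit=2.101, seed=1):
    """Fraction of repeated experiments that reject when the null is true.

    Both samples come from the same N(0, 1) population (an assumption of
    this sketch), so every rejection counted here is a false rejection.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(pooled_t(x, y)) > crit:
            rejections += 1
    return rejections / n_reps


rate = rejection_rate()
print(f"long-run rejection rate under a true null: {rate:.3f}")
```

The printed proportion should land close to .05. Each individual pass
through the loop either rejects or does not; the .05 describes only the
procedure over many repetitions.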

I would not bother with this were it not for the fact that Steve's apparent
confusion -- or at least imprecise statements -- is widespread among
scientists, in my experience, and leads to frequent misapplications and
misinterpretations of significance testing. The woes of Stat 101 training.

... But that's another diatribe...

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Steve Lianoglou
Sent: Wednesday, September 16, 2009 12:19 PM
To: Robert Hall
Cc: r-help
Subject: Re: [R] T-test to check equality, unable to interpret the results.

Hi,

I was just going to send this when I saw Erik's post. He's right -- we  
can't say anything about your data, but we can say something about  
using a t-test.

I'm not a "real" statistician, so this answer isn't very rigorous, but  
might be helpful.

On Sep 16, 2009, at 2:55 PM, Robert Hall wrote:

> I believe the t-test checks for difference amongst the two sets, and
> p-value < 0.05 means both the sets are statistically different.

A t-test is used to check if the difference in the mean of two samples  
is statistically significant.

The null hypothesis is that the two means are not different.

If you reject the null, it means you have reason to believe that the  
means of the two samples are different.

See the uses section here:

http://en.wikipedia.org/wiki/Student's_t-test

> Here while checking for dissimilarity the p-value is 0.3288, does it mean
> that higher the p-value (while t.test checks for dis-similarity) means more
> similar the results are (which is the case above as the means of the
> results are very close!)
> Please help me interpret the results..

Your intuition is essentially correct. In general, the higher the
p-value (in any statistical test), the less confident you should be that
rejecting the null hypothesis is a good idea.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
