[R] Calculation of extremely low p-values (in lm)

Robert Baer rbaer at atsu.edu
Mon Dec 3 15:26:58 CET 2012


On 12/3/2012 6:20 AM, Sindri wrote:
> Dear R-users
>
> Please excuse me if this topic has been covered before, but I was unable to
> find anything relevant by searching
>
> I am currently doing a comparison of two biological variables that have a
> highly significant linear relationship.   I know that the p-value of linear
> regression is not so interesting in itself, but this particular value does
> raise a question.
>
> How does R calculate (extremely low) p-values for linear regression?
>
> For my data I got a p-value on the order of 10^-9 and a reviewer commented
> on this.  I tried to run the same analysis in both SAS and Sigmastat to be
> sure that I was doing it right, but both these programs only return a
> p-value of p < 0.0001
> Since I am unable to reproduce my results in another statistics program, it
> would be nice to be able to explain this unusally low p-value to the
> reviewers.
This is a matter of you understanding that the p-value is an area under 
a probability density curve.  R is simply printing out the actual area 
in a tail of some distribution.  The other statistical program is making 
the assumption that you are using the p-value to compare to a cutoff 
alpha value that is (in most fields) never set much below p<0.001.  If p 
< alpha the "hypothesis test crowd" , would choose to reject NULL  
hypothesis, so the other statistics programs take the attitude --  "why 
provide more detail?".  R chooses to give you the actual number and let 
you do what you will with it.  You could probably benefit from reviewing 
hypothesis testing in a basic statistics book if this is not clear.

Note that 10e-9 is indeed less than 0.0001, so the programs don't 
disagree.  R just provides more detail.

>
> This "problem" can be illustrated with the following made-up data:
>
> x_var<-c(0.149,0.178,0.3474,0.167,0.121,0.182,0.176,0.448,0.091,0.083,0.090,0.407,0.378,0.132,0.227,0.172,0.088,0.392,0.425,0.150,0.319,0.190,0.171,0.290,0.214,0.431,0.193)
>
> y_var<-c(0.918,0.394,0.131,0.9084,0.916,0.934,0.928,0.279,0.830,0.927,0.964,0.323,0.097,0.914,0.614,0.790,0.984,0.530,0.207,0.858,0.408,0.919,0.869,0.347,0.834,0.276,0.940)
>
> fit<-lm(y_var~x_var)
>
>> summary(fit)
> Call:
> lm(formula = y_var ~ x_var)
>
> Residuals:
>       Min       1Q   Median       3Q      Max
> -0.39152 -0.06027  0.00933  0.10024  0.22711
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)  1.18696    0.06394  18.562 3.90e-16 ***
> x_var       -2.25529    0.24788  -9.098 2.08e-09 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.1503 on 25 degrees of freedom
> Multiple R-squared: 0.768,	Adjusted R-squared: 0.7588
> F-statistic: 82.78 on 1 and 25 DF,  p-value: 2.083e-09
>
>
> With kind regards,
> Sindri Traustason
>
>
>
> -----
> -----------------------------------------
> Sindri Traustason
> Glostrup Hospital Ophthalmology Research Dept.
> Copenhagen, Demark
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Calculation-of-extremely-low-p-values-in-lm-tp4651823.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
__________________
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksille College of Osteopathic Medicine
A. T. Still University of Health Sciences
Kirksville, MO 63501 USA




More information about the R-help mailing list