[R] regression analysis

R. Michael Weylandt michael.weylandt at gmail.com
Thu Jul 26 16:20:03 CEST 2012


Something like that should work, though you might need to construct
the formula as a string instead:

paste("y ~", names(x)[i])
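
To make the string-formula idea concrete, here is a minimal sketch of the
whole loop. The data are simulated and the names (dat, pvals, sorted) are
my own placeholders, not anything from your setup; I'm also assuming the
predictors sit in a data frame x with syntactically valid column names.

```r
set.seed(1)                                   # simulated data, illustration only
n <- 50
x <- as.data.frame(matrix(rnorm(n * 20), nrow = n))  # 20 predictors, V1..V20
dat <- cbind(y = rnorm(n), x)                 # response plus predictors

pvals <- numeric(ncol(x))                     # preallocate one slot per predictor
names(pvals) <- names(x)

for (i in seq_along(x)) {
  f <- as.formula(paste("y ~", names(x)[i]))  # e.g. y ~ V1
  mod <- lm(f, data = dat)
  # row 2 is the slope, column 4 is Pr(>|t|)
  pvals[i] <- summary(mod)$coefficients[2, 4]
}

sorted <- sort(pvals)                         # increasing order
head(sorted)
```

Keeping the names on pvals means the sorted vector still tells you which
predictor each p-value came from.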

More worrisome is the methodology: running 10k regressions against a
single response is almost guaranteed to turn up some spurious
"significant" results. This methodological mistake goes by different
names in different fields (multiple comparisons, data dredging), but
it's not too hard to illustrate:

If I have 10 patients with a rare disease and a list of what each of
them had for dinner every night over the last 20 years, it's
practically guaranteed that on some night, perhaps 1562 days ago, they
all had fish tacos for dinner. But to conclude that fish tacos cause
my rare disease on a four-and-a-quarter-year lag strains credibility.
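
The same effect is easy to see in a quick simulation. This is only a
sketch on made-up data: the response and all predictors are independent
noise, so every "significant" slope is spurious by construction. The
sizes (n, k) and the choice of the Benjamini-Hochberg adjustment via
p.adjust() are my assumptions, picked for illustration; other
corrections may suit your problem better.

```r
set.seed(42)                          # simulated data, illustration only
n <- 30                               # observations
k <- 1000                             # candidate predictors, all pure noise
y <- rnorm(n)                         # response unrelated to any predictor
X <- matrix(rnorm(n * k), nrow = n)

# one simple regression per column; keep the p-value of the slope
pvals <- apply(X, 2, function(xj) {
  summary(lm(y ~ xj))$coefficients[2, 4]
})

n_raw <- sum(pvals < 0.05)            # by chance alone, roughly 5% of k

# adjust for multiple testing (Benjamini-Hochberg false discovery rate)
p_adj <- p.adjust(pvals, method = "BH")
n_adj <- sum(p_adj < 0.05)            # usually far fewer survive

c(raw = n_raw, adjusted = n_adj)
```

With 10,000 predictors instead of 1,000, the raw count scales up
accordingly, which is why the sorted list of unadjusted p-values is
not trustworthy on its own.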

I'll let you work out the details.

Michael

On Wed, Jul 25, 2012 at 4:03 PM, Silvano Cesar da Costa <silvano at uel.br> wrote:
> Hi,
>
> I have to run 10,000 linear regression analyses, and the response variable
> (RESP) is the same for all 10,000 independent variables.
>
> y ~ x[i]
>
> i = 1, ..., 10000
>
> For each analysis I must extract the p-value and then sort the p-values
> in increasing order.
>
> I thought an analysis of the type:
>
> ana = numeric(10000)
> for(i in 1:10000){
>   mod = lm(RESP ~ x[i])
>   ana[i] = summary(mod)$coefficients[2, 4]
> }
>
> Could someone suggest reading material or offer any suggestions? Thank you.
>
> ---------------------------------------------
> Silvano Cesar da Costa
>
> Universidade Estadual de Londrina
> Centro de Ciências Exatas
> Departamento de Estatística
>
> Fone: (43) 3371-4346


