[R] Is there a fast way to do several hundred thousand ANOVA tests?

Benilton Carvalho bcarvalh at jhsph.edu
Mon Aug 24 18:43:05 CEST 2009


Have you tried fitting all the columns at once, with the matrix as the response?

fits <- lm(a~b)
fstat <- sapply(summary(fits), function(x) x[["fstatistic"]][["value"]])

It takes 3 seconds for 100K columns on my machine (running on battery).
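
For anyone who wants to check this against the loop version before running it on the full data, here is a minimal sketch on simulated data. The matrix size, the seed and the names n_obs, n_cols and f_aov are assumptions made up for the illustration; only a, b, fits and fstat come from the thread.

```r
# Small simulated stand-in for the poster's data (sizes are assumptions,
# far smaller than the real 50 x 800000 matrix)
set.seed(1)
n_obs  <- 50
n_cols <- 200
a <- matrix(rnorm(n_obs * n_cols), nrow = n_obs, ncol = n_cols)
b <- factor(rep(c("cond1", "cond2", "cond3"), length.out = n_obs))

# A matrix response makes lm() fit every column in one call ("mlm" object)
fits <- lm(a ~ b)

# summary() on an "mlm" object returns one summary.lm per column;
# each one carries the named vector fstatistic (value, numdf, dendf)
fstat <- sapply(summary(fits), function(x) x[["fstatistic"]][["value"]])

# Cross-check the first column against the aov() approach from the question
f_aov <- summary(aov(a[, 1] ~ b))[[1]][1, 4]
print(all.equal(unname(fstat[1]), f_aov))
```

With 800000 columns, summary(fits) builds one summary object per column, so it may be worth running this on blocks of columns if memory gets tight. That is a guess on my part, not something I have measured.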

b

On Aug 23, 2009, at 9:55 PM, big permie wrote:

> Dear R users,
>
> I have a matrix a and a classification vector b such that
>
>> str(a)
> num [1:50, 1:800000]
> and
>> str(b)
> Factor w/ 3 levels "cond1","cond2","cond3"
>
> I'd like to run an ANOVA on each of the 800000 columns and record the
> F statistic for each test. I currently do this using:
>
> f.stat.vec <- numeric(ncol(a))
>
> for (i in seq_len(ncol(a))) {
>  f.test.frame <- data.frame(nums = a[,i], cond = b)
>  aov.vox <- aov(nums ~ cond, data = f.test.frame)
>  f.stat <- summary(aov.vox)[[1]][1,4]
>  f.stat.vec[i] <- f.stat
> }
>
> The problem is that this code takes about 70 minutes to run.
>
> Is there a faster way to do an ANOVA and record the F statistic for
> each column?
>
> Any help would be appreciated.
>
> Thanks
> Heath
