[R] Very slow: using double apply and cor.test to compute correlation p.values for 2 matrices

jim holtman jholtman at gmail.com
Wed Nov 26 15:14:40 CET 2008


Your time is being taken up in cor.test because you are calling it
100,000,000 times: once for each of the 1,000 rows of m1 paired with
each of the 100,000 rows of m2.  So grin and bear it with the amount
of work you are asking it to do.

Here I am only calling it 10,000 times (two 100 x 100 matrices):

> m1 <- matrix(rnorm(10000), ncol=100)
> m2 <- matrix(rnorm(10000), ncol=100)
> Rprof('/tempxx.txt')
> system.time(cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor.test(x,y)$p.value }) }))
   user  system elapsed
   8.86    0.00    8.89
>

so my guess is that calling it 100,000,000 times will take about
100,000,000 * 0.000886 seconds, i.e. roughly 88,600 seconds or about
25 hours.
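
The same kind of extrapolation can be scripted: time cor.test on a
small random sample of row pairs and scale by the number of pairs in
the full problem.  This is only a rough sketch; estimate_runtime and
its n_sample argument are names made up for illustration, and the
result is machine-dependent.

```r
# Hypothetical helper (not from the original post): time cor.test() on a
# random sample of row pairs and scale up to every pair of rows.
estimate_runtime <- function(m1, m2, n_sample = 200) {
  i <- sample(nrow(m1), n_sample, replace = TRUE)
  j <- sample(nrow(m2), n_sample, replace = TRUE)
  elapsed <- system.time(
    for (k in seq_len(n_sample)) cor.test(m1[i[k], ], m2[j[k], ])$p.value
  )[["elapsed"]]
  # as.numeric() avoids integer overflow for very large matrices
  n_pairs <- as.numeric(nrow(m1)) * nrow(m2)
  list(n_pairs          = n_pairs,
       seconds_per_call = elapsed / n_sample,
       estimated_hours  = elapsed / n_sample * n_pairs / 3600)
}

m1 <- matrix(rnorm(10000), ncol = 100)
m2 <- matrix(rnorm(10000), ncol = 100)
est <- estimate_runtime(m1, m2)
est$n_pairs          # number of cor.test calls the double apply would make
est$estimated_hours  # rough projection for these two matrices
```

Passing the full-size matrices instead gives a projection for the real
job without having to run it.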

If you run Rprof, you will see it is spending most of its time there:

  0   8.8 root
  1.    8.8 apply
  2. .    8.8 FUN
  3. . .    8.8 apply
  4. . . .    8.7 FUN
  5. . . . .    8.6 cor.test
  6. . . . . .    8.4 cor.test.default
  7. . . . . . .    2.4 match.arg
  8. . . . . . . .    1.7 eval
  9. . . . . . . . .    1.4 deparse
 10. . . . . . . . . .    0.6 .deparseOpts
 11. . . . . . . . . . .    0.2 pmatch
 11. . . . . . . . . . .    0.1 sum
 10. . . . . . . . . .    0.5 %in%
 11. . . . . . . . . . .    0.3 match
 12. . . . . . . . . . . .    0.3 is.factor
 13. . . . . . . . . . . . .    0.3 inherits
  8. . . . . . . .    0.2 formals
  9. . . . . . . . .    0.2 sys.function
  7. . . . . . .    2.1 cor
  8. . . . . . . .    1.1 match.arg
  9. . . . . . . . .    0.7 eval
 10. . . . . . . . . .    0.6 deparse
 11. . . . . . . . . . .    0.3 .deparseOpts
 12. . . . . . . . . . . .    0.1 pmatch
 11. . . . . . . . . . .    0.2 %in%
 12. . . . . . . . . . . .    0.2 match
 13. . . . . . . . . . . . .    0.1 is.factor
 14. . . . . . . . . . . . . .    0.1 inherits
  9. . . . . . . . .    0.1 formals
  8. . . . . . . .    0.5 stopifnot
  9. . . . . . . . .    0.2 match.call
  8. . . . . . . .    0.1 pmatch
  8. . . . . . . .    0.1 is.data.frame
  9. . . . . . . . .    0.1 inherits
  7. . . . . . .    1.5 paste
  8. . . . . . . .    1.4 deparse
  9. . . . . . . . .    0.6 .deparseOpts
 10. . . . . . . . . .    0.3 pmatch
 10. . . . . . . . . .    0.1 any
  9. . . . . . . . .    0.6 %in%
 10. . . . . . . . . .    0.6 match
 11. . . . . . . . . . .    0.5 is.factor
 12. . . . . . . . . . . .    0.4 inherits
 13. . . . . . . . . . . . .    0.2 mode
  7. . . . . . .    0.4 switch
  8. . . . . . . .    0.1 qnorm
  7. . . . . . .    0.2 pt
  5. . . . .    0.1 $
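
The profile suggests that much of the time goes to argument matching
and deparsing inside cor.test rather than to the arithmetic itself.
If only the default two-sided Pearson p-value is needed, one possible
shortcut is to compute all the correlations with a single cor() call
and convert them to p-values through the t distribution.  This is a
sketch under those assumptions (Pearson, two-sided, no missing
values), not something timed on the full-size data; cor_pvalues is a
made-up name.

```r
# Sketch: p-values for every pair of rows, assuming the default
# two-sided Pearson test and no missing values.
cor_pvalues <- function(m1, m2) {
  n <- ncol(m1)                            # observations per row
  r <- cor(t(m2), t(m1))                   # nrow(m2) x nrow(m1) correlations
  tstat <- r * sqrt((n - 2) / (1 - r^2))   # t statistic with n - 2 df
  2 * pt(-abs(tstat), df = n - 2)          # two-sided p-value
}

set.seed(1)
m1 <- matrix(rnorm(1000), ncol = 100)      # 10 rows
m2 <- matrix(rnorm(2000), ncol = 100)      # 20 rows
p_fast <- cor_pvalues(m1, m2)

# Compare against the double apply on this small case
p_slow <- apply(m1, 1, function(x) apply(m2, 1, function(y) cor.test(x, y)$p.value))
all.equal(p_fast, p_slow, check.attributes = FALSE)
```

For the full problem the result has 100,000 x 1,000 entries, about
800 MB per matrix of doubles, so it may be necessary to process m2 in
blocks of rows.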

On Tue, Nov 25, 2008 at 11:55 PM, Daren Tan <daren76 at hotmail.com> wrote:
>
> My two matrices are roughly the sizes of m1 and m2 below. I tried using two nested apply calls with cor.test to compute the correlation p-values. After more than an hour, the code is still running. Please help me make it more efficient.
>
> m1 <- matrix(rnorm(100000), ncol=100)
> m2 <- matrix(rnorm(10000000), ncol=100)
>
> cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor.test(x,y)$p.value }) })
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


