[R] Matrix efficiency in 1.9.0 vs 1.8.1

Martin Maechler maechler at stat.math.ethz.ch
Thu Apr 29 10:07:09 CEST 2004


>>>>> "Stephen" == Stephen Ellner <spe2 at cornell.edu>
>>>>>     on Wed, 28 Apr 2004 17:19:38 -0400 writes:

    Stephen> I'm seeking some advice on effectively using the
    Stephen> new Matrix library in R1.9.0 for operations with
    Stephen> large dense matrices. I'm working on integral
    Stephen> operator models (implemented numerically via matrix
    Stephen> operations) and except for the way entries are
    Stephen> generated, the examples below really are
    Stephen> representative of my problem sizes.

    Stephen> My main concern is speed of large dense matrix
    Stephen> multiplication.  In R 1.8.1 (Windows2000
    Stephen> Professional, dual AthlonMP 2800)
    >> a=matrix(rnorm(2500*2500),2500,2500); v=rnorm(2500);
    >> system.time(a%*%v);
    Stephen> [1] 0.11 0.00 0.12 NA NA

    Stephen> In R 1.9.0, same platform:
    >> a=matrix(rnorm(2500*2500),2500,2500); v=rnorm(2500);
    >> system.time(a%*%v);
    Stephen> [1] 0.24 0.00 0.25 NA NA

    <... and then you talk about the Matrix **package** (not `library') ...>

Unfortunately, the slowdown from 1.8.1 to 1.9.0 is not specific
to your computer, OS, or R build: I see the same phenomenon
on my Linux and Solaris clients, e.g.,

  > n <- 2500; set.seed(1); a <- matrix(rnorm(n * n), n); v <- rnorm(n)
  > gc()
            used (Mb) gc trigger  (Mb)
  Ncells  435805 11.7     741108  19.8
  Vcells 6378701 48.7   19150535 146.2

For R 1.8.1 on an AMD Athlon:

 > tmp <- gc(); system.time(for(i in 1:4) y <- a %*% v)
 [1] 0.36 0.00 0.39 0.00 0.00

For R 1.9.0 (patched):

 > tmp <- gc(); system.time(for(i in 1:4) y <- a %*% v)
 [1] 0.81 0.00 0.87 0.00 0.00
 
----

On a fast (hyper-threaded) Pentium 4 with 2 GB RAM, the
slowdown is even larger, a factor of about 3 (measured over
several runs, only one of which is shown here):

R 1.8.1:

>  tmp <- gc(); system.time(for(i in 1:10) y <- a %*% v)
[1] 0.23 0.00 0.23 0.00 0.00

R 1.9.0 (patched):

> tmp <- gc(); system.time(for(i in 1:10) y <- a %*% v)
[1] 0.75 0.00 0.74 0.00 0.00
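
For anyone wanting to repeat the comparison on their own R versions,
the timings above could be wrapped in a small helper along these lines.
This is only a sketch: `time_matvec' and its arguments are names made
up for illustration, and the default `n' and `reps' just mirror the
values used above.

```r
## Sketch: time repeated dense matrix-vector products.
## time_matvec() is a hypothetical helper, not part of R or of Matrix.
time_matvec <- function(n = 2500, reps = 10, seed = 1) {
  set.seed(seed)                 # reproducible entries
  a <- matrix(rnorm(n * n), n)   # dense n x n matrix
  v <- rnorm(n)
  invisible(gc())                # collect garbage before timing
  tm <- system.time(for (i in seq_len(reps)) y <- a %*% v)
  list(timing = tm, result = y)
}

## Small problem size so that the example runs quickly
res <- time_matvec(n = 500, reps = 4)
print(res$timing)
```

Running the same call under each R version, several times, gives
numbers that can be compared directly.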

---------

So, there's definitely something we (R core) should look into.

Martin Maechler



