[Rd] the incredible lightness of crossprod

Paul Gilbert pgilbert at bank-banque-canada.ca
Thu Jan 27 22:55:24 CET 2005



Prof Brian Ripley wrote:

> On Thu, 27 Jan 2005, Paul Gilbert wrote:
>
>> A few weeks ago I noticed
>>
>>>  z <- matrix(rnorm(20000),10000,2)
>>
>>
>>> system.time(for (i in 1:1000) apply(z,2,sum))
>>
>> [1] 13.44  0.48 14.08  0.00  0.00
>>
>>> system.time(for (i in 1:1000) rep(1,10000) %*% z)
>>
>> [1] 6.46 0.11 6.84 0.00 0.00
>
>
> So both run in a few milliseconds for rather large problems.
>
>> which seemed  completely contrary to all my childhood teachings. Now
>>
>>> system.time(for (i in 1:1000) crossprod(rep(1,10000), z))
>>
>> [1] 1.90 0.12 2.24 0.00 0.00
>>
>> makes sense because it is suppose to be faster than %*% , but why is 
>> apply so slow?
>
>
> `so slow' sic: what are you going to do in the 7ms you saved? 

Yes, I think I've spent more  time checking this than I will ever  save.

>> (And should I go back and change apply in my code everywhere or is 
>> this likely to reverse again?)
>
>
> It's not likely.  apply is an R-level loop, and  %*% is a C-level one.
> However,  %*% is not supposed to be much slower than crossprod, and 
> the devil is in the details of how the BLAS is implemented: the code 
> is very similar.
>
> That %*% was faster than apply has been true in all my (17 years) of 
> S/R experience.  Your childhood may predate S3, of course.

I didn't say the teachings were correct. I clearly should have 
questioned some things sooner.

> I still think one should use row/colSums for clarity with acceptable
> efficiency.  It must be very unusual for these operations to be a 
> dominant part of a calculation, so let's not lose proportion here. 

Agreed. I think I like the clarity reason best.

Thanks for the explanation.

Paul



More information about the R-devel mailing list