# [Rd] reason for odd timings

Simon Urbanek @|mon@urb@nek @end|ng |rom R-project@org
Sat Jan 22 22:44:54 CET 2022

```FWIW crossprod timings are likely not generalizable, because unlike sum() they depend on the BLAS implementation used so it's not uncommon to see significant differences. For simple things like this the parallel BLAS can actually be slower than the built-in one (since the computation is limited by memory throughput).

Also if you actually want to measure the difference, don't use functions as they have significant overhead (for short evals), just use the expressions, e.g.:

> library(microbenchmark)
> ll <- c(100, 1000, 10000, 100000, 1000000, 10000000)
> l <- lapply(ll, function(nn) {
+    set.seed(1234)
+    vv <- runif(nn)
+    rbind(microbenchmark(sum(vv^2)),
+          microbenchmark(c(crossprod(vv))))
+ })
> do.call("rbind", lapply(l, summary, unit="us"))
expr       min        lq        mean     median         uq       max neval
1         sum(vv^2)     0.703     0.731     1.09781     0.7665     0.9050    10.356   100
2  c(crossprod(vv))     2.317     2.334     3.39735     2.4025     2.5390    92.110   100
3         sum(vv^2)     3.200     3.261     3.51941     3.3580     3.5425    13.109   100
4  c(crossprod(vv))     3.919     3.973     4.29774     4.0145     4.1200    25.054   100
5         sum(vv^2)    57.461    60.910    69.12497    67.4455    75.9360   102.943   100
6  c(crossprod(vv))    18.321    18.468    19.32948    18.5735    18.6795    82.353   100
7         sum(vv^2)   286.488   566.928   607.11935   596.7610   631.5625  4955.724   100
8  c(crossprod(vv))   161.757   163.338   167.62892   165.1660   165.8425   327.439   100
9         sum(vv^2)  3221.395  3694.859  4604.01607  3867.5830  4050.5715  8565.486   100
10 c(crossprod(vv))  1898.899  1922.386  1955.75945  1941.1760  1959.3340  2392.512   100
11        sum(vv^2) 43634.025 44486.356 45493.26093 44850.8460 45170.9870 97560.633   100
12 c(crossprod(vv)) 19379.950 19631.935 19817.88547 19770.3475 19922.6035 21512.017   100

That said, I suspect that we used more time talking about this than will be ever saved by picking one method over the other ;).

Cheers,
Simon

> On Jan 23, 2022, at 6:18 AM, J C Nash <profjcnash using gmail.com> wrote:
>
> Many thanks for quick and reasonable explanation.
>
> Now the issue is to figure out when it is worthwhile to recommend/use sums2.
>
> Best, JN
>
>
> On 2022-01-21 23:38, Steve Martin wrote:
>> Just to add a bit more, stripping out most of your test shows that
>> there is one iteration (the 2nd one) that takes a lot longer than the
>> others because the sums() function gets bytecode compiled.
>> library(microbenchmark)
>> sums <- function(vv) {
>>   ss <- sum(vv^2)
>>   ss
>> }
>> sums2 <- compiler::cmpfun(sums)
>> x <- runif(100)
>>           expr    time
>> 1  sums(x)   29455
>> 2  sums(x) 3683091
>> 3 sums2(x)    7108
>> 4  sums(x)    4305
>> 5  sums(x)    2733
>> 6  sums(x)    2797
>> The paragraph on JIT in the details of ?compiler::compile explains
>> that this is the default behavior.
>> Steve
>> On Fri, 21 Jan 2022 at 20:51, J C Nash <profjcnash using gmail.com> wrote:
>>>
>>> Occasionally I run some rather trivial timings to get an idea of what might
>>> be the best way to compute some quantities.
>>>
>>> The program below gave timings for sums of squares of 100 elements much greater
>>> than those for 1000, which seems surprising. Does anyone know the cause of this?
>>>
>>> This isn't holding up my work. Just causing some head scratching.
>>>
>>> JN
>>>
>>>> source("sstimer.R")
>>>  n        t(forloop) : ratio      t(sum) : ratio         t(crossprod)    all.equal
>>> 100      38719.15  :  1.766851   13421.12  :  0.6124391          21914.21        TRUE
>>> 1000     44722.71  :  20.98348   3093.94  :  1.451648    2131.33         TRUE
>>> 10000    420149.9  :  42.10269   27341.6  :  2.739867    9979.17         TRUE
>>> 1e+05    4070469  :  39.89473    343293.5  :  3.364625   102030.2        TRUE
>>> 1e+06    42293696  :  33.27684   3605866  :  2.837109    1270965         TRUE
>>> 1e+07    408123066  :  29.20882          35415106  :  2.534612   13972596        TRUE
>>>>
>>>
>>> # crossprod timer
>>> library(microbenchmark)
>>> suml<-function(vv) {
>>>     ss<-0.0
>>>     for (i in 1:length(vv)) {ss<-ss+vv[i]^2}
>>>     ss
>>> }
>>> sums<-function(vv) {
>>>   ss<-sum(vv^2)
>>>   ss
>>> }
>>> sumc<-function(vv) {
>>>   ss<-as.numeric(crossprod(vv))
>>>   ss
>>> }
>>> ll <- c(100, 1000, 10000, 100000, 1000000, 10000000)
>>> cat(" n  \t  t(forloop) : ratio \t  t(sum) : ratio \t t(crossprod) \t all.equal \n")
>>> for (nn in ll ){
>>>    set.seed(1234)
>>>    vv <- runif(nn)
>>>    tsuml<-microbenchmark(sl<-suml(vv), unit="us")
>>>    tsums<-microbenchmark(ss<-sums(vv), unit="us")
>>>    tsumc<-microbenchmark(sc<-sumc(vv), unit="us")
>>>    ml<-mean(tsuml\$time)
>>>    ms<-mean(tsums\$time)
>>>    mc<-mean(tsumc\$time)
>>>    cat(nn,"\t",ml," : ",ml/mc,"\t",ms," : ",ms/mc,"\t",mc,"\t",all.equal(sl, ss, sc),"\n")
>>> }
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

```