[R] The L Word

Thu Feb 24 18:34:36 CET 2011

>>>>> "HW" == Hadley Wickham <hadley at rice.edu>
>>>>>     on Thu, 24 Feb 2011 10:14:35 -0600 writes:

    >> Note however that I've never seen evidence for a *practical*
    >> difference in simple cases, nor in such cases as part of a
    >> larger computation.
    >> But I'm happy to see one if anyone has an interesting example.
    >> 
    >> E.g., I would typically never use  0L:100L  instead of 0:100
    >> in an R script because I think code readability (and self
    >> explainability) is of considerable importance too.

    HW> But : casts to integer anyway:

    >> str(0:100)
    HW> int [1:101] 0 1 2 3 4 5 6 7 8 9 ...

Sure !!  I've been the one who had to use  0:0  or 1:1
in those rare cases where integers were required (e.g. in .C(..)),
before "the L word" existed.
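
To illustrate the point (a minimal sketch added for clarity, using only base R): the colon operator returns an integer vector when its endpoints are whole numbers within integer range, which is why 1:1 could stand in for 1L.

```r
# ':' with integer-valued endpoints gives an integer vector,
# so 1:1 is the same object as the literal 1L.
x <- 1:1
typeof(x)                   # "integer"
identical(x, 1L)            # TRUE

# A bare numeric literal, by contrast, is a double.
typeof(1)                   # "double"

# The two spellings discussed in this thread give identical results.
identical(0:100, 0L:100L)   # TRUE
```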

    HW> And performance in this case is (obviously) negligible:

    >> library(microbenchmark)
    >> microbenchmark(as.integer(c(0, 100)), times = 1000)
    HW> Unit: nanoseconds
    HW>                       min  lq median  uq   max
    HW> as.integer(c(0, 100)) 712 791    813 896 15840

    HW> (mainly included as opportunity to try out microbenchmark)
??
Thanks!  Did not know it.

*HOWEVER*   the above   as.integer(c(0,100))
is of course *much more* than what is internally needed to cast
the two doubles to integer.
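
A rough sketch of why that is (an illustration only; it separates the steps visible at R level and says nothing about the C internals): the benchmarked expression makes two function calls and allocates two vectors, whereas ':' is a single call.

```r
# Assumption: this only spells out the R-level steps of as.integer(c(0, 100)).
d <- c(0, 100)        # step 1: call c() and allocate a double vector
i <- as.integer(d)    # step 2: call as.integer() and allocate an integer vector
typeof(d)             # "double"
typeof(i)             # "integer"

# By contrast, ':' is one call that returns the whole integer sequence.
s <- 0:100
typeof(s)             # "integer"
length(s)             # 101
```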

Try this a few times ... and wonder :

  boxplot(mb2 <- microbenchmark(L = 1L:100L, 1:100, times=5000),  notch=TRUE)

  > mb2
  Unit: nanoseconds
        min  lq median  uq  max
  L     316 410    472 555 6843
  1:100 311 393    440 497 7309

the result (on my 64-bit linux) seems to indicate that  1L:100L 
takes even slightly (but significantly ["notches"]) longer.

However, using

   boxplot(mb <- microbenchmark(1:100, L = 1L:100L, times=5000), notch=TRUE)

  > mb
  Unit: nanoseconds
        min  lq median  uq   max
  1:100 296 401    469 550  9426
  L     313 396    438 496 16525

is less conclusive.
So, actually, this is exactly one of those cases where
I do *not* see a difference, even if I look very hard.

{ BTW: There is at least one (if not two) buglets in 'microbenchmark',
  which I evaded by using "L = " above :

 1) It should not use   as.character(exprs)  
    but rather          unlist(lapply(exprs, deparse))

 2) boxplot.microbenchmark should probably be more careful in
     the case where two rows have the same name (as happens if
     I leave out "L = " above)
}
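
A small sketch of the two labelling approaches from point 1) (base R only; the `exprs` vector below is a hypothetical stand-in, since how microbenchmark stores its unevaluated expressions is not shown in this thread):

```r
# Hypothetical stand-in for the expressions passed to the benchmark.
exprs <- expression(1L:100L, 1:100)

# Labelling via as.character(), the approach reported as problematic above.
# Its exact output is left unstated here; print it to check on your R version.
labels_char <- as.character(exprs)
print(labels_char)

# Labelling via deparse(), the suggested alternative: the 'L' suffix is kept,
# so the two rows get distinct names.
labels_deparse <- unlist(lapply(exprs, deparse))
print(labels_deparse)            # "1L:100L" "1:100"
anyDuplicated(labels_deparse)    # 0, i.e. no duplicated labels
```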

Martin