[R] The L Word

Thu Feb 24 19:20:54 CET 2011

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Martin Maechler
> Sent: Thursday, February 24, 2011 7:45 AM
> To: Claudia Beleites
> Cc: r-help at r-project.org
> Subject: Re: [R] The L Word
> 
> >>>>> "CB" == Claudia Beleites <cbeleites at units.it>
> >>>>>     on Thu, 24 Feb 2011 12:31:55 +0100 writes:
> 
>     CB> On 02/24/2011 11:20 AM, Prof Brian Ripley wrote:
>     >> On Thu, 24 Feb 2011, Tal Galili wrote:
>     >> 
>     >>> Thank you all for the answers.
>     >>> 
>     >>> So if I may extend on the question -
>     >>> When is it important to use 'Literal integer'?
>     >>> Under what situations could not using it cause problems?
>     >>> Is it a matter of efficiency or precision or both?
>     >> 
>     >> Efficiency: it avoids unnecessary type conversions. For example
>     >> 
>     >> length(x) > 1
>     >> 
>     >> has to coerce the lhs to double. We have converted the base
>     >> code to use integer constants because such small efficiency
>     >> gains can add up.
>     >> 
>     >> Integer vectors can be stored more compactly than doubles, but
>     >> that is not going to help for length 1:
>     >> 
>     >>> object.size(1)
>     >> 48 bytes
>     >>> object.size(1L)
>     >> 48 bytes
>     >> (32-bit system).
>     CB> see:
> 
>     CB> n <- 0L : 100L
> 
>     CB> szi <- sapply (n, function (n) object.size (integer (n)))
>     CB> szd <- sapply (n, function (n) object.size (double (n)))
>     CB> plot (n, szd)
>     CB> points (n, szi, col = "red")
> 
> yes. 
> 
> Note however that I've never seen evidence for a *practical*
> difference in simple cases, or when such cases are part of a
> larger computation.
> But I'm happy to see one if anyone has an interesting example.

I don't know how interesting this example is, but I use <digits>L
when combining a scalar with what I know is an integer vector so
I don't unnecessarily change its type.  Also, if I have a function
that returns an integer vector in general cases but a special value
like NA or -1 in unusual cases, I would use NA_integer_ or -1L for those
special cases so the function returns the same class of data in
all cases.  These things can be important when trying to write a
faster/better version of a builtin function, where I
want to make the new output exactly match the original.
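
A small sketch of both points, run at the prompt.  The helper
first_negative() is a made-up name used only for illustration; it
is not a base R function.

```r
x <- 1:5                      # an integer vector

typeof(x + 1L)                # "integer": the L suffix keeps the type
typeof(x + 1)                 # "double":  a plain 1 promotes the result

# Hypothetical helper: index of the first negative element, or a
# missing value when there is none.
first_negative <- function(v) {
    hit <- which(v < 0)       # which() returns an integer vector
    if (length(hit)) hit[1L] else NA_integer_
}

typeof(first_negative(c(3, -1, 2)))   # "integer"
typeof(first_negative(c(3, 1, 2)))    # "integer", although the value is NA
typeof(NA)                            # "logical": a bare NA would change the type
```

With a bare NA in the fallback branch the function would return a
logical in the unusual case and an integer otherwise.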

E.g., here is a function that does exactly what sequence() does
but is about 10 times faster for long input vectors (say
seq_len(1e6)%%4L):
  Sequence.L <- function (nvec) 
  {
      seq_len(sum(nvec)) - rep(cumsum(c(0L, nvec[-length(nvec)])), nvec)
  }
If I change the 0L to 0.0 (or 0) then the subtraction is done in
double precision, the result is a double vector, and it is no longer
identical() to sequence()'s integer result.
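
The difference can be checked directly.  In this sketch Sequence.D
is a made-up name for the same function with 0 in place of 0L, and
the input is a short integer vector chosen for illustration; nothing
here measures speed.

```r
Sequence.L <- function (nvec)
{
    seq_len(sum(nvec)) - rep(cumsum(c(0L, nvec[-length(nvec)])), nvec)
}

# Hypothetical variant: identical except for the double constant 0.
Sequence.D <- function (nvec)
{
    seq_len(sum(nvec)) - rep(cumsum(c(0, nvec[-length(nvec)])), nvec)
}

nvec <- c(2L, 0L, 3L)         # integer input, as seq_len(n) %% 4L would give

Sequence.L(nvec)              # 1 2 1 2 3
typeof(Sequence.L(nvec))      # "integer"
typeof(Sequence.D(nvec))      # "double"

identical(Sequence.L(nvec), Sequence.D(nvec))        # FALSE: types differ
isTRUE(all.equal(Sequence.L(nvec), Sequence.D(nvec)))  # TRUE: values agree
```

The values are the same either way; only identical(), which also
compares the storage type, tells the two apart.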

(If I were designing a new data analysis language from scratch, I'd
be tempted to omit the integer type and make all numbers 64-bit doubles,
logicals 1 byte or 2 bit things, and maybe throw in some integral
types for image processing but not for general use.  32-bit integers
are pretty limiting.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> 
> E.g., I would typically never use  0L:100L  instead of 0:100
> in an R script because I think code readability (and self
> explainability) is of considerable importance too.
> 
> Martin
> 