[Rd] ifelse() woes ... can we agree on an ifelse2() ?

Mon Aug 8 12:20:27 CEST 2016

>>>>> Uwe Ligges <ligges at statistik.tu-dortmund.de>
>>>>>     on Sun, 7 Aug 2016 09:51:58 +0200 writes:

    > On 06.08.2016 17:30, Duncan Murdoch wrote:
    >> On 06/08/2016 10:18 AM, Martin Maechler wrote:

   [.................]

    >>> Of course, an ifelse2()  should also be more efficient than
    >>> ifelse() in typical "atomic" cases.
    >> 
    >> I don't think it is obvious how to make it more efficient.  ifelse()
    >> already skips evaluation of yes or no if not needed.  (An argument could
    >> be made that it would be better to guarantee evaluation of both, but
    >> it's usually easy enough to do this explicitly, so I don't see a need.)

    > Same from here: I do not see how this can easily be made more efficient,
    > since evaluating only parts causes a lot of copies of objects, which
    > slows things down; hence you need some complexity in yes and no to make
    > evaluating parts of them more efficient at the R level.

Yes, Duncan and Uwe are right, and my "wish" comment above was
mostly misleading.  Some of the many small changes made to ifelse()
since its initial, simple version [1998, R version 0.63.3]

ifelseR0633 <- function (test, yes, no)
{
    ans <- test
    test <- as.logical(test)
    nas <- is.na(test)
    ans[ test] <- rep(yes, length = length(ans))[ test]
    ans[!test] <- rep(no,  length = length(ans))[!test]
    ans[nas] <- NA
    ans
}

were exactly for adding speed in some of these cases.
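
To illustrate Duncan's point that the current ifelse() skips evaluating
yes or no when it is not needed (unlike the 1998 version above, which
always evaluates both), here is a minimal sketch.  The helpers
yes_val() and no_val() and the environment 'seen' are made up for this
illustration only.

```r
# Record which of the two argument expressions actually get evaluated.
seen <- new.env()
seen$args <- character(0)

yes_val <- function() { seen$args <- c(seen$args, "yes"); 1 }
no_val  <- function() { seen$args <- c(seen$args, "no");  0 }

# No element of 'test' is TRUE, so 'yes' is never needed and,
# thanks to lazy evaluation, yes_val() is never called.
res <- ifelse(c(FALSE, FALSE, FALSE), yes_val(), no_val())

seen$args   # only "no" has been recorded
```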

    > Anyway, to solve the problem, we may want an additional argument to ifelse2()
    > that allows specifying the type of the result (as vapply does)?

A good idea, though probably only needed / desirable if we were to
consider a C-based version {like vapply}; for the moment I did not
want to go there.
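
For readers less familiar with what "as vapply does" refers to: vapply()
takes a FUN.VALUE template and fails if a result does not match it.
The sketch below only shows that existing behaviour; how (or whether)
an ifelse2() would take such an argument is left open here.

```r
# The template numeric(1) declares: each result is one double.
squares <- vapply(1:3, function(i) i^2, FUN.VALUE = numeric(1))

# A result of the wrong type is an error rather than a silent coercion.
mismatch <- tryCatch(
    vapply(1:3, function(i) as.character(i), FUN.VALUE = numeric(1)),
    error = function(e) "error")
```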

The current ifelse() is nice with "pre-S3" objects, such as
atomic (named) vectors and (dimnamed) arrays, including matrices,
by keeping most attributes for those, and it does that relatively
efficiently.
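
A small sketch of that behaviour, with made-up data: the result takes
its attributes from 'test', so names, dim and dimnames survive, whereas
the class of an S3 object passed as yes / no (a Date vector here) does
not.

```r
# Named vector: names come from the 'test' argument.
v  <- c(a = 1, b = -2, c = 3)
rv <- ifelse(v > 0, v, NA)

# Matrix with dimnames: dim and dimnames are kept as well.
m  <- matrix(c(-1, 2, -3, 4), nrow = 2,
             dimnames = list(c("r1", "r2"), c("c1", "c2")))
rm <- ifelse(m > 0, m, 0)

# S3 class of 'yes' / 'no' is lost: the result is a plain numeric vector.
dates <- as.Date(c("2016-08-06", "2016-08-08"))
rd <- ifelse(c(TRUE, FALSE), dates, dates)
class(rd)
```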

What I really meant (not above, but earlier) when talking about
ifelse()'s inefficiency should really *not* have been brought into
this thread; I'm sorry for that confusion.

I mean the fact that many many usages of ifelse() are of the
form
	ifelse(logiFn(x), f1(x), f2(x))

  {with f1() or f2() often even being constant}

and e.g.,  in the case where logiFn(x) gives few TRUEs and f1(.)
is expensive and f2(.) very cheap (say "constant" NA), it is
much more efficient to use

     ans <- x
     Y <- logiFn(x)
     ans[ Y] <- f1(x[ Y])
     ans[!Y] <- f2(x[!Y])

as the expensive function is only called on a small subset of
the full x.
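
A minimal sketch of that comparison.  The concrete logiFn(), f1() and
f2() below are stand-ins invented for the illustration (f1() just
counts how many elements it is handed), and the subsetting form assumes
logiFn(x) contains no NAs.

```r
calls <- new.env()
calls$n <- 0L

logiFn <- function(x) x > 0.95                    # few TRUEs
f1 <- function(x) {                               # stand-in for "expensive"
    calls$n <- calls$n + length(x)
    sqrt(x)
}
f2 <- function(x) rep(NA_real_, length(x))        # cheap, constant NA

set.seed(1)
x <- runif(1000)

# ifelse() form: f1() is handed all of x.
via_ifelse <- ifelse(logiFn(x), f1(x), f2(x))
n_ifelse <- calls$n

# Subsetting form: f1() is handed only the elements where Y is TRUE.
calls$n <- 0L
ans <- x
Y <- logiFn(x)
ans[ Y] <- f1(x[ Y])
ans[!Y] <- f2(x[!Y])
n_subset <- calls$n

identical(ans, via_ifelse)   # same result, far fewer elements passed to f1()
```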

I'm working on the main topic and *am* thankful to Duncan
for his conceptual analysis and the (few) proposals.

Martin

    > Best,
    > Uwe

    >> Duncan Murdoch
    >> 
    >>> 
    >>> 
    >>> Thank you for your ideas and suggestions.
    >>> Again, there's no promise of implementation coming along with this
    >>> e-mail.
    >>> 
    >>> Martin Maechler
    >>> ETH Zurich