[Rd] Time to revisit ifelse ?
Mikael Jagan
j@g@nmn2 @end|ng |rom gm@||@com
Sat Jul 12 00:16:35 CEST 2025
Thanks Ivan - I've responded in line. I'll just add here that I've put
together a single-function package and placed it in a public repository:
https://github.com/jaganmn/ifelse
Perhaps we (all) can iterate more there, opening issues as it seems that
there could be many ... ?
Mikael
On 2025-07-11 4:01 pm, Ivan Krylov wrote:
> On Fri, 11 Jul 2025 04:41:13 -0400
> Mikael Jagan <jaganmn2 using gmail.com> wrote:
>
>> But perhaps we should aim for consensus on a few issues beforehand.
>
> Thank you for raising this topic!
>
>> (Sorry if these have been discussed to death already elsewhere. In
>> that case, links to relevant threads would be helpful ...)
>
> The data.table::fifelse issue [1] comes to mind together with the vctrs
> article section about the need for a less strict ifelse() [2].
>
>> 1. Should the type and class attribute of the return value be
>> exactly the type and class attribute of c(yes[0L], no[0L]),
>> independent of 'test'? Or something else?
>
> Can we afford an escape hatch for cases when one of the ifelse()
> branches is NA or other special value handled by the '[<-' method
> belonging to the class of the other branch? data.table::fifelse() has a
> not exactly documented special case where it coerces NA_LOGICAL to the
> appropriate type, so that data.table::fifelse(runif(10) < .5,
> Sys.Date(), NA) works as intended, and dplyr::if_else also supports
> this case, but none of the other ifelses I tested do that.
>
> Can we say that if only some of the 'yes' / 'no' / 'na' arguments have
> classes, those must match and they determine the class of the return
> value? It could be convenient, and it also could be a source of bugs.
>
Right, it's quite tricky because 'c' dispatches only on its first argument,
so class(c(.Date(0), 0)) is "Date" while class(c(0, .Date(0))) is "numeric".
Hence, indeed, "commutativity" / "symmetry" (which is what users tend to
expect) would require special handling.

In a way, I like the simplicity of letting methods for 'c' handle all
coercions and clearly documenting the potential for asymmetry. The resulting
code seems easy to understand and maintain. I am wary of the Pandora's box
which is comparison of class attributes, but maybe you have something simple
in mind?
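
To make the asymmetry concrete, here is a small illustration. It assumes a
recent R (>= 4.3, where as.Date() accepts a bare number), and the helper
name 'proto' is made up for this example only:

```r
# Assumption: R >= 4.3, so that c.Date() can coerce a bare number.
d <- .Date(0)            # 1970-01-01, class "Date"

class(c(d, 0))           # "Date":    the method c.Date() is dispatched on 'd'
class(c(0, d))           # "numeric": default method, attributes are dropped

# The same asymmetry for the zero-length prototype from question 1.
# 'proto' is a hypothetical helper, not part of any package.
proto <- function(yes, no) class(c(yes[0L], no[0L]))
proto(d, NA)             # "Date"
proto(NA, d)             # "numeric"
```

So with the plain c(yes[0L], no[0L]) rule, the Sys.Date() / NA case mentioned
above would keep its class only when the classed argument comes first.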
>> 2. What should be the attributes of the return value (other than
>> 'class')?
>
> data.table::fifelse (and kit::iif, which shares a lot of the code) also
> preserve the names, but neither dplyr nor hutils do. I think it would
> be reasonable to preserve the 'dim' attribute and thus the 'dimnames'
> attribute too.
>
I currently do this:
https://github.com/jaganmn/ifelse/blob/b29904f6e0f206abd677f535cd081603c5486d9c/R/ifelse1.R#L32-L47
preserving "new" attributes from 'test' (not limited to 'dim' and 'dimnames'),
notably with a bit of care where 'test' is a time series object. Does that
seem like overkill ... ?
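
For discussion, a much reduced sketch of the idea. This is not the code at
the link above: 'copy_shape' is a hypothetical helper that handles only
'names', 'dim' and 'dimnames' and ignores time series attributes entirely:

```r
# Hypothetical helper: copy the "shape" attributes of 'test' onto an
# answer vector of the same length.
copy_shape <- function(ans, test) {
    stopifnot(length(ans) == length(test))
    if (is.null(dim(test))) {
        names(ans) <- names(test)        # plain vector: keep names only
    } else {
        dim(ans) <- dim(test)            # array: keep dim and dimnames
        dimnames(ans) <- dimnames(test)
    }
    ans
}

test <- matrix(c(TRUE, FALSE, NA, TRUE), 2L, 2L,
               dimnames = list(c("a", "b"), c("x", "y")))
res <- copy_shape(c(1, 2, NA, 4), test)
dim(res)         # same as dim(test)
dimnames(res)    # same as dimnames(test)
```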
>> 3. Should the new function be stricter and/or more verbose?
>> E.g., should it signal a condition if length(yes) or length(no) is
>> neither 1 nor length(test)?
>
> Leaning towards yes, but only because I haven't met any uses for
> recycling of non-length-1 inputs myself. An allow.recycle=FALSE option
> is probably overkill, right?
>
I'm a bit agnostic here. 'diag<-' allows recycling only for length-1 assignment
values, so there is a precedent. So far, I have allowed recycling
unconditionally without signaling anything, but that can easily change.
https://github.com/jaganmn/ifelse/blob/b29904f6e0f206abd677f535cd081603c5486d9c/R/ifelse1.R#L26-L29
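
For reference, the 'diag<-' precedent and one way a stricter rule could look.
'check_length' is a hypothetical helper written for this message, not what
the package currently does:

```r
# The precedent: 'diag<-' accepts length 1 or the full diagonal length only.
m <- diag(3)
diag(m) <- 0                       # length 1: recycled
diag(m) <- 1:3                     # full length: fine
r <- tryCatch(diag(m) <- 1:2,      # anything else is an error
              error = function(e) conditionMessage(e))

# Hypothetical strict check for 'yes' and 'no' against n = length(test).
check_length <- function(x, n, what) {
    if (length(x) != 1L && length(x) != n)
        stop(sprintf("length(%s) is %d, but should be 1 or %d",
                     what, length(x), n), call. = FALSE)
    invisible(x)
}

check_length(1, 4L, "yes")         # passes silently
e <- tryCatch(check_length(1:2, 4L, "yes"), error = function(e) e)
conditionMessage(e)
```

Whether this should be an error, a warning, or nothing at all is exactly the
open question.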
>> 4. Should the most common case, in which neither 'yes' nor 'no'
>> has a 'class' attribute, be handled in C?
>
> This could be a very reasonable performance-correctness trade-off.
>
For the purpose of performance testing, you'll see that I've added a basic
C implementation which can be enabled with a logical argument:
https://github.com/jaganmn/ifelse/blob/b29904f6e0f206abd677f535cd081603c5486d9c/R/ifelse1.R#L9-L10
>> FWIW, my first (and untested) approximation of an ifelse2 is just
>> this:
>>
>> function (test, yes, no)
>
> I think a widely asked-for feature is a separate 'na' branch.
>
Yes, definitely a TODO.
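
A rough sketch of what such a branch could look like, only to fix ideas. The
name 'ifelse_na', the default na = NA, and the rule that the result type
comes from c(yes[0L], no[0L], na[0L]) are all assumptions, not settled
design:

```r
# Hypothetical sketch of an ifelse with a separate 'na' branch.
ifelse_na <- function(test, yes, no, na = NA) {
    test <- as.logical(test)
    n <- length(test)
    # All-NA answer whose type (and class, if any) is decided by c()
    # applied to zero-length pieces of the three branches.
    ans <- c(yes[0L], no[0L], na[0L])[rep(NA_integer_, n)]
    fill <- function(ans, i, value) {
        # rep() rather than rep_len(), so that classed values keep their class
        if (length(i)) ans[i] <- rep(value, length.out = n)[i]
        ans
    }
    ans <- fill(ans, which(test), yes)
    ans <- fill(ans, which(!test), no)
    ans <- fill(ans, which(is.na(test)), na)
    ans
}

ifelse_na(c(TRUE, FALSE, NA), 1, 2, na = -1)   # 1 2 -1
ifelse_na(c(TRUE, FALSE, NA), 1, 2)            # 1 2 NA
```

Note that this version recycles unconditionally and copies no attributes from
'test', so it sidesteps questions 2 and 3 above.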