[Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

robin hankin hankin.robin at gmail.com
Thu Sep 8 23:06:11 CEST 2016


Could we take a cue from min() and max()?

> x <- 1:10
> min(x[x>7])
[1] 8
> min(x[x>11])
[1] Inf
Warning message:
In min(x[x > 11]) : no non-missing arguments to min; returning Inf
>

As ?min says, this is implemented to preserve transitivity, and this
makes a lot of sense.
I think the issuing of a warning here is a good compromise; I can
always turn off warnings if I want.
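
To make the transitivity point concrete, here is a small sketch. The vectors x and y are made up for illustration, and suppressWarnings() is used only to keep the output quiet. Because min() of an empty vector is Inf and max() of one is -Inf, the extreme of a concatenation should equal the extreme of the per-piece extremes, even when one piece is empty:

```r
x <- c(3, 1, 4)
y <- numeric(0)                      # zero-length piece (illustrative)

# Extremes of the empty piece; base R warns and returns Inf / -Inf
min_y <- suppressWarnings(min(y))
max_y <- suppressWarnings(max(y))

# min/max of the whole equals min/max of the per-piece results
min_whole <- min(c(x, y))
min_parts <- min(min(x), min_y)
max_whole <- max(c(x, y))
max_parts <- max(max(x), max_y)

stopifnot(identical(min_whole, min_parts),
          identical(max_whole, max_parts))
```

If min(y) were an error instead, the second form would fail whenever a piece happened to be empty.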

I find this behaviour of min() and max() to be annoying in the *right*
way: it annoys me precisely when I need to be
annoyed, that is, when I haven't thought through the consequences of
sending zero-length arguments.
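
For anyone who wants that annoyance quieter or louder, something like the following should work. It is only a sketch: safe_min is a hypothetical helper name (not part of base R), and it turns any warning raised by min() into an error, not only the zero-length one.

```r
x <- 1:10

# Quiet: accept Inf for an empty selection and drop the warning
quiet <- suppressWarnings(min(x[x > 11]))

# Loud: promote the warning to an error
# (safe_min is a hypothetical helper, named here for illustration only)
safe_min <- function(v) {
  tryCatch(min(v), warning = function(w) stop(conditionMessage(w)))
}

ok  <- safe_min(x[x > 7])
bad <- tryCatch(safe_min(x[x > 11]), error = function(e) "error")
```

Setting options(warn = 2) would have a similar effect globally, at the cost of promoting every warning in the session.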


On Fri, Sep 9, 2016 at 6:00 AM, Paul Gilbert <pgilbert902 at gmail.com> wrote:
>
>
> On 09/08/2016 01:22 PM, Gabriel Becker wrote:
>>
>> On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap <wdunlap at tibco.com> wrote:
>>
>>> Shouldn't binary operators (arithmetic and logical) throw an error
>>> when one operand is NULL (or another type that doesn't make sense)?  This
>>> is a different case than a zero-length operand of a legitimate type.  E.g.,
>>>      any(x < 0)
>>> should return FALSE if x is number-like and length(x)==0 but give an
>>> error if x is NULL.
>>>
>> Bill,
>>
>> That is a good point. I can see the argument for this in the case that the
>> non-zero length is 1. I'm not sure which is better though. If we switch
>> any() to all(), things get murky.
>>
>> Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
>> all(x>0)), but the likelihood of this being a thought-bug on the author's
>> part is exceedingly high, imho.
>
>
> I suspect there may be more R users than you think that understand and use
> vacuously true in code. I don't really like the idea of turning a perfectly
> good and properly documented mathematical test into an error in order to
> protect against a possible "thought-bug".
>
> Paul
>
>
>> So the desirable behavior seems to depend on the angle we look at it from.
>>
>> My personal opinion is that x < y with length(x)==0 should fail if
>> length(y) > 1, at least, and I'd be for it being an error even if y is
>> length 1, though I do acknowledge this is more likely (though still quite
>> unlikely imho) to be the intended behavior.
>>
>> ~G
>>
>>>
>>> I.e., I think the type check should be done before the length check.
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker <gmbecker at ucdavis.edu>
>>> wrote:
>>>
>>>> Martin,
>>>>
>>>> Like Robin and Oliver I think this type of edge-case consistency is
>>>> important and that it's fantastic that R-core - and you personally - are
>>>> willing to tackle some of these "gotcha" behaviors. "Little" stuff like
>>>> this really does combine to go a long way to making R better and better.
>>>>
>>>> I do wonder a bit about the
>>>>
>>>>     x = 1:2
>>>>     y = NULL
>>>>     x < y
>>>>
>>>> case.
>>>>
>>>> Returning a logical of length 0 is more backwards compatible, but is it
>>>> ever what the author actually intended? I have trouble thinking of a
>>>> case where that less-than didn't carry an implicit assumption that y was
>>>> non-NULL.  I can say that in my own code, I've never hit that behavior
>>>> in a case that wasn't an error.
>>>>
>>>> My vote (unless someone else points out a compelling use for the
>>>> behavior) is for this to throw an error. As a developer, I'd rather
>>>> things like this break so the bug in my logic is visible, rather than
>>>> propagating as the 0-length logical is &'ed or |'ed with other logical
>>>> vectors, or used to subset, or (in the case it should be length 1)
>>>> passed to if() (if throws an error now, but the rest would silently
>>>> "work").
>>>>
>>>> Best,
>>>> ~G
>>>>
>>>> On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler <
>>>> maechler at stat.math.ethz.ch>
>>>> wrote:
>>>>
>>>>>>>>>> robin hankin <hankin.robin at gmail.com>
>>>>>>>>>>     on Thu, 8 Sep 2016 10:05:21 +1200 writes:
>>>>>
>>>>>
>>>>>     > Martin I'd like to make a comment; I think that R's
>>>>>     > behaviour on 'edge' cases like this is an important thing
>>>>>     > and it's great that you are working on it.
>>>>>
>>>>>     > I make heavy use of zero-extent arrays, chiefly because
>>>>>     > the dimnames are an efficient and logical way to keep
>>>>>     > track of certain types of information.
>>>>>
>>>>>     > If I have, for example,
>>>>>
>>>>>     > a <- array(0,c(2,0,2))
>>>>>     > dimnames(a) <- list(name=c('Mike','Kevin'), NULL,item=c("hat","scarf"))
>>>>>
>>>>>
>>>>>     > Then in R-3.3.1, 70800 I get
>>>>>
>>>>>     > > a > 0
>>>>>     > logical(0)
>>>>>     > >
>>>>>
>>>>>     > But in 71219 I get
>>>>>
>>>>>     > > a > 0
>>>>>     > , , item = hat
>>>>>
>>>>>
>>>>>     > name
>>>>>     > Mike
>>>>>     > Kevin
>>>>>
>>>>>     > , , item = scarf
>>>>>
>>>>>
>>>>>     > name
>>>>>     > Mike
>>>>>     > Kevin
>>>>>
>>>>>     > (which is an empty logical array that holds the names of the
>>>>>     > people and their clothes). I find the behaviour of 71219 very much
>>>>>     > preferable because there is no reason to discard the information
>>>>>     > in the dimnames.
>>>>>
>>>>> Thanks a lot, Robin, (and Oliver) !
>>>>>
>>>>> Yes, the above is such a case where the new behavior makes much sense.
>>>>> And this behavior remains identical after the 71222 amendment.
>>>>>
>>>>> Martin
>>>>>
>>>>>     > Best wishes
>>>>>     > Robin
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>     > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler <
>>>>> maechler at stat.math.ethz.ch>
>>>>>     > wrote:
>>>>>
>>>>>     >> >>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     >> >>>>>     on Tue, 6 Sep 2016 22:26:31 +0200 writes:
>>>>>     >>
>>>>>     >> > Yesterday, changes to R's development version were committed,
>>>>>     >> > relating to arithmetic, logic ('&' and '|') and
>>>>>     >> > comparison/relational ('<', '==') binary operators
>>>>>     >> > which in NEWS are described as
>>>>>     >>
>>>>>     >> > SIGNIFICANT USER-VISIBLE CHANGES:
>>>>>     >>
>>>>>     >> > [.............]
>>>>>     >>
>>>>>     >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka
>>>>>     >> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now
>>>>>     >> > behave consistently, notably for arrays of length zero.
>>>>>     >>
>>>>>     >> > Arithmetic between length-1 arrays and longer non-arrays had
>>>>>     >> > silently dropped the array attributes and recycled.  This
>>>>>     >> > now gives a warning and will signal an error in the future,
>>>>>     >> > as it has always for logic and comparison operations in
>>>>>     >> > these cases (e.g., compare ‘matrix(1,1) + 2:3’ and
>>>>>     >> > ‘matrix(1,1) < 2:3’).
>>>>>     >>
>>>>>     >> > As the above "visually suggests" one could think of the changes
>>>>>     >> > falling mainly into two groups,
>>>>>     >> > 1) <0-extent array>  (op)     <non-array>
>>>>>     >> > 2) <1-extent array>  (arith)  <non-array of length != 1>
>>>>>     >>
>>>>>     >> > These changes are partly not backward compatible and may break
>>>>>     >> > existing code.  We believe that the internal consistency gained
>>>>>     >> > from the changes is worth the few places with problems.
>>>>>     >>
>>>>>     >> > We expect some package maintainers (10-20, or even more?) will
>>>>>     >> > need to adapt their code.
>>>>>     >>
>>>>>     >> > Case '2)' above mainly results in a new warning, e.g.,
>>>>>     >>
>>>>>     >> >> matrix(1,1) + 1:2
>>>>>     >> > [1] 2 3
>>>>>     >> > Warning message:
>>>>>     >> > In matrix(1, 1) + 1:2 :
>>>>>     >> > dropping dim() of array of length one.  Will become ERROR
>>>>>     >> >>
>>>>>     >>
>>>>>     >> > whereas '1)' gives errors in cases where the result silently was
>>>>>     >> > a vector of length zero, or keeps the array attributes (dim &
>>>>>     >> > dimnames) in cases where these were silently dropped.
>>>>>     >>
>>>>>     >> > The following is a "heavily" commented  R script showing (all ?)
>>>>>     >> > the important cases with changes :
>>>>>     >>
>>>>>     >> > ----------------------------------------------------------------------------
>>>>>     >>
>>>>>     >> > (m <- cbind(a=1[0], b=2[0]))
>>>>>     >> > Lm <- m; storage.mode(Lm) <- "logical"
>>>>>     >> > Im <- m; storage.mode(Im) <- "integer"
>>>>>     >>
>>>>>     >> > ## 1. -------------------------
>>>>>     >> > try( m & NULL ) # in R <= 3.3.x :
>>>>>     >> > ## Error in m & NULL :
>>>>>     >> > ##  operations are possible only for numeric, logical or complex types
>>>>>     >> > ##
>>>>>     >> > ## gives 'Lm' in R >= 3.4.0
>>>>>     >>
>>>>>     >> > ## 2. -------------------------
>>>>>     >> > m + 2:3 ## gave numeric(0), now remains matrix identical to  m
>>>>>     >> > Im + 2:3 ## gave integer(0), now remains matrix identical to Im (integer)
>>>>>     >>
>>>>>     >> > m > 1      ## gave logical(0), now remains matrix identical to Lm (logical)
>>>>>     >> > m > 0.1[0] ##  ditto
>>>>>     >> > m > NULL   ##  ditto
>>>>>     >>
>>>>>     >> > ## 3. -------------------------
>>>>>     >> > mm <- m[,c(1:2,2:1,2)]
>>>>>     >> > try( m == mm ) ## now gives error   "non-conformable arrays",
>>>>>     >> > ## but gave logical(0) in R <= 3.3.x
>>>>>     >>
>>>>>     >> > ## 4. -------------------------
>>>>>     >> > str( Im + NULL)  ## gave "num", now gives "int"
>>>>>     >>
>>>>>     >> > ## 5. -------------------------
>>>>>     >> > ## special case for arithmetic w/ length-1 array
>>>>>     >> > (m1 <- matrix(1,1,1, dimnames=list("Ro","col")))
>>>>>     >> > (m2 <- matrix(1,2,1, dimnames=list(c("A","B"),"col")))
>>>>>     >>
>>>>>     >> > m1 + 1:2  # ->  2:3  but now with warning to  "become ERROR"
>>>>>     >> > tools::assertError(m1 & 1:2)# ERR: dims [product 1] do not match the length of object [2]
>>>>>     >> > tools::assertError(m1 < 1:2)# ERR:                  (ditto)
>>>>>     >> > ##
>>>>>     >> > ## non-0-length arrays combined with {NULL or double() or ...} *fail*
>>>>>     >>
>>>>>     >> > ### Length-1 arrays:  Arithmetic with |vectors| > 1  treated array as scalar
>>>>>     >> > m1 + NULL # gave  numeric(0) in R <= 3.3.x --- still, *but* w/ warning to "be ERROR"
>>>>>     >> > try(m1 > NULL)    # gave  logical(0) in R <= 3.3.x --- an *error* now in R >= 3.4.0
>>>>>     >> > tools::assertError(m1 & NULL)    # gave and gives error
>>>>>     >> > tools::assertError(m1 | double())# ditto
>>>>>     >> > ## m2 was slightly different:
>>>>>     >> > tools::assertError(m2 + NULL)
>>>>>     >> > tools::assertError(m2 & NULL)
>>>>>     >> > try(m2 == NULL) ## was logical(0) in R <= 3.3.x; now error as above!
>>>>>     >>
>>>>>     >> > ----------------------------------------------------------------------------
>>>>>     >>
>>>>>     >>
>>>>>     >> > Note that in R's own  'nls'  sources, there was one case of
>>>>>     >> > situation '2)' above, i.e. a  1x1-matrix was used as a "scalar".
>>>>>     >>
>>>>>     >> > In such cases, you should explicitly coerce it to a vector,
>>>>>     >> > either ("self-explainingly") by  as.vector(.), or as I did in
>>>>>     >> > the nls case  by  c(.) :  The latter is much less
>>>>>     >> > self-explaining, but nicer to read in mathematical formulae, and
>>>>>     >> > currently also more efficient because it is a .Primitive.
>>>>>     >>
>>>>>     >> > Please use R-devel with your code, and let us know if you see
>>>>>     >> > effects that seem adverse.
>>>>>     >>
>>>>>     >> I've been slightly surprised (or even "frustrated") by the empty
>>>>>     >> reaction on our R-devel list to this post.
>>>>>     >>
>>>>>     >> I would have expected some critique, maybe even some praise,
>>>>>     >> ... in any case some sign people are "thinking along" (as we say
>>>>>     >> in German).
>>>>>     >>
>>>>>     >> In the meantime, I've actually thought along the one case which
>>>>>     >> is last above:  The <op>  (binary operation) between a
>>>>>     >> non-0-length array and a 0-length vector (and NULL which should
>>>>>     >> be treated like a 0-length vector):
>>>>>     >>
>>>>>     >> R <= 3.3.1  *is* quite inconsistent with these:
>>>>>     >>
>>>>>     >>
>>>>>     >> and my proposal above (implemented in R-devel, since Sep.5) would
>>>>>     >> give an error for all these, but instead, R really could be more
>>>>>     >> lenient here:
>>>>>     >> A 0-length result is ok, and it should *not* inherit the array
>>>>>     >> (dim, dimnames), since the array is not of length 0. So instead
>>>>>     >> of the above [for the very last part only!!], we would aim for
>>>>>     >> the following. These *all* give an error in current R-devel,
>>>>>     >> with the exception of 'm1 + NULL' which "only" gives a "bad
>>>>>     >> warning" :
>>>>>     >>
>>>>>     >> ------------------------
>>>>>     >>
>>>>>     >> m1 <- matrix(1,1)
>>>>>     >> m2 <- matrix(1,2)
>>>>>     >>
>>>>>     >> m1 + NULL #    numeric(0) in R <= 3.3.x ---> OK ?!
>>>>>     >> m1 > NULL #    logical(0) in R <= 3.3.x ---> OK ?!
>>>>>     >> try(m1 & NULL)    # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>>>>     >> try(m1 | double())# ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>>>>     >> ## m2 slightly different:
>>>>>     >> try(m2 + NULL)  # ERROR in R <= 3.3.x ---> change to double(0) ?!
>>>>>     >> try(m2 & NULL)  # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>>>>     >> m2 == NULL # logical(0) in R <= 3.3.x ---> OK ?!
>>>>>     >>
>>>>>     >> ------------------------
>>>>>     >>
>>>>>     >> This would be slightly more back-compatible than the currently
>>>>>     >> implemented proposal. Everything else I said remains true, and
>>>>>     >> I'm pretty sure most changes needed in packages would remain to
>>>>>     >> be done.
>>>>>     >>
>>>>>     >> Opinions ?
>>>>>     >>
>>>>>     >>
>>>>>     >>
>>>>>     >> > In some cases where R-devel now gives an error but did not
>>>>>     >> > previously, we could contemplate giving another  "warning
>>>>>     >> > .... 'to become ERROR'" if there was too much breakage, though
>>>>>     >> > I don't expect that.
>>>>>     >>
>>>>>     >>
>>>>>     >> > For the R Core Team,
>>>>>     >>
>>>>>     >> > Martin Maechler,
>>>>>     >> > ETH Zurich
>>>>>     >>
>>>>>     >> ______________________________________________
>>>>>     >> R-devel at r-project.org mailing list
>>>>>     >> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>     >>
>>>>>
>>>>>
>>>>>
>>>>>     > --
>>>>>     > Robin Hankin
>>>>>     > Neutral theorist
>>>>>     > hankin.robin at gmail.com
>>>>>
>>>> --
>>>> Gabriel Becker, PhD
>>>> Associate Scientist (Bioinformatics)
>>>> Genentech Research
>>>>



-- 
Robin Hankin
Neutral theorist
hankin.robin at gmail.com


