[Rd] '==' operator: inconsistency in data.frame(...) == NULL

Hilmar Berger berger @end|ng |rom mp||b-ber||n@mpg@de
Sat Sep 14 13:31:27 CEST 2019


Dear all,

I did some more tests regarding the == operator in Ops.data.frame (see 
below). All tests were done in R 3.6.1 (x86_64-w64-mingw32).

I find that errors are also thrown when comparing a zero-length 
data.frame to atomic objects with length > 0, which should be a valid 
case according to the documentation. This can be traced to a check in 
the last line of Ops.data.frame, which tests for an empty result value 
(i.e. list()) but does not handle a list of empty values (i.e. 
list(logical(0))), which is what is actually generated in those cases. 
There is a simple fix (see also below).
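
The difference between the two kinds of "empty" value can be seen in 
isolation. The following is only a minimal sketch in base R; it does not 
call Ops.data.frame, and the variable names are made up for illustration:

```r
# Illustrative only: the two kinds of "empty" intermediate result.
empty_list      <- list()            # no elements at all
list_of_empties <- list(logical(0))  # one element holding zero results

v1 <- unlist(empty_list, recursive = FALSE, use.names = FALSE)
v2 <- unlist(list_of_empties, recursive = FALSE, use.names = FALSE)

is.null(v1)      # TRUE:  caught by an is.null() check
is.null(v2)      # FALSE: not caught by an is.null() check
length(v1) == 0  # TRUE
length(v2) == 0  # TRUE:  a length test treats both cases alike
```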

There are other issues with the S4 class example (i.e. data.frame() == 
<s4_object with representation as list>) which fails for different reasons.

##############################################################################

d_0 = data.frame(a = numeric(0)) # zero length data.frame
d_00 = data.frame(numeric(0)) # zero length data.frame without names
# remove names so that value is an empty list() at the end of Ops.data.frame
names(d_00) <- NULL
d_3 = data.frame(a=1:3) # non-empty data.frame

m_0 = matrix(logical(0)) # zero length matrix
#------------------------
# error A:
# Error in matrix(if (is.null(value)) logical() else value, nrow = nr,
#   dimnames = list(rn,  :
# length of 'dimnames' [2] not equal to array extent

d_0 == 1   # error A
d_00 == 1  # <0 x 0 matrix>
d_3 == 1   # <3 x 1 matrix>

d_0 == logical(0) # error A
d_00 == logical(0) # <0 x 0 matrix>
d_3 == logical(0) # error A

d_0 == NULL # error A
d_00 == NULL # <0 x 0 matrix>
d_3 == NULL # error A

m_0 == d_0  # error A
m_0 == d_00 # <0 x 0 matrix>
m_0 == d_3  # error A

# empty matrix for comparison
m_0 == 1 # < 0 x 1 matrix>
m_0 == logical(0) # < 0 x 1 matrix>
m_0 == NULL # < 0 x 1 matrix>

# All errors above could be solved by changing the last line in
# Ops.data.frame from
#   matrix(if (is.null(value)) logical() else value, nrow = nr,
#          dimnames = list(rn, cn))
# to
#   matrix(if (length(value) == 0) logical() else value, nrow = nr,
#          dimnames = list(rn, cn))
# Alternatively, or in addition, one could add an explicit test for
# data.frame() == NULL, if desired, and raise an error.

#########################################################################################
# Non-empty return value, but failing in the same line of code due to
# incompatible dimensions.
# Should Ops.data.frame be dispatched at all for
# <data.frame> == <S4 object>?
setClass("FOOCLASS",
           representation("list")
)
ma = new("FOOCLASS", list(M=matrix(rnorm(300), 30,10)))
isS4(ma)
d_3 == ma # error A
##########################################################################################
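
The explicit test suggested above could look roughly like the following. 
This is only a sketch under assumptions: check_operand is a hypothetical 
helper name, not part of base R or of Ops.data.frame, and rejecting NULL 
outright is just one possible choice.

```r
# Hypothetical helper (not in base R): reject operands that are NULL or
# neither atomic nor list-like before any comparison is attempted.
check_operand <- function(x) {
  # NULL is tested explicitly because is.atomic(NULL) differs between
  # R versions.
  if (is.null(x) || !(is.atomic(x) || is.list(x))) {
    stop("comparison is possible only for atomic and list types")
  }
  invisible(x)
}

check_operand(1:3)              # passes silently
check_operand(data.frame(a=1))  # passes silently, a data.frame is a list
try(check_operand(NULL))        # signals the error message above
```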

Best regards,
Hilmar

On 11/09/2019 at 13:26, Hilmar Berger wrote:
> Sorry, I can't reproduce the example below even on the same machine. 
> However, the following example produces the same error as NULL values 
> in prior examples:
>
> > setClass("FOOCLASS",
> +          representation("list")
> + )
> > ma = new("FOOCLASS", list(M=matrix(rnorm(300), 30,10)))
> > isS4(ma)
> [1] TRUE
> > data.frame(a=1:3) == ma
> Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), 
> nrow = nr,  :
>   length of 'dimnames' [2] not equal to array extent
>
> Best,
> Hilmar
>
>
> On 11/09/2019 12:24, Hilmar Berger wrote:
>> Another example where a data.frame is compared to (here non-null, 
>> non-empty) non-atomic values in Ops.data.frame, resulting in an error 
>> message:
>>
>> setClass("FOOCLASS2",
>>          slots = c(M="matrix")
>> )
>> ma = new("FOOCLASS2", M=matrix(rnorm(300), 30,10))
>>
>> > isS4(ma)
>> [1] TRUE
>> > ma == data.frame(a=1:3)
>> Error in eval(f) : dims [product 1] do not match the length of object [3]
>>
>> As for the NULL/logical(0) cases, I would suggest explicitly testing 
>> for invalid conditions in Ops.data.frame and generating a 
>> comprehensible message (e.g. "comparison is possible only for atomic 
>> and list types") where appropriate.
>>
>> Best regards,
>> Hilmar
>>
>>
>> On 11/09/2019 11:55, Hilmar Berger wrote:
>>>
>>> In the data.frame()==NULL cases I have the impression that the fact 
>>> that both sides are non-atomic is not properly detected and 
>>> therefore R tries to go on with the == method for data.frames.
>>>
>>> From a cursory check in Ops.data.frame() and some debugging I have 
>>> the impression that the case of the second argument being non-atomic 
>>> or empty is not handled at all and the function progresses until the 
>>> end, where it fails in the last step on an empty value:
>>>
>>> matrix(unlist(value, recursive = FALSE, use.names = FALSE),
>>>     nrow = nr, dimnames = list(rn, cn)) 
>>
>


