[Rd] na.omit inconsistent with is.na on list

Toby Hocking tdhock5 @end|ng |rom gm@||@com
Fri Aug 13 01:30:34 CEST 2021


Hi Gabe, thanks for the feedback.

On Thu, Aug 12, 2021 at 1:19 PM Gabriel Becker <gabembecker using gmail.com>
wrote:

> Hi Toby,
>
> This definitely appears intentional: the first expression of
> stats:::na.omit.default is
>
>    if (!is.atomic(object))
>        return(object)
>
Based on this code it does seem that the documentation could be clarified
to say atomic vectors.
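
A minimal sketch of what that early return means in practice, assuming the first expression of stats:::na.omit.default is as quoted above. The variable names are only for illustration.

```r
# A list is not atomic, so na.omit.default returns it unchanged.
L <- list(NULL, NA, 0)
is.atomic(L)                 # FALSE
identical(na.omit(L), L)     # TRUE: the NA element is still there

# An atomic vector goes past the early return and has its NA dropped.
x <- c(1, NA, 3)
as.vector(na.omit(x))        # 1 3 (as.vector drops the na.action attribute)
```

So "vectors" in the documentation only matches the observed behavior if it is read as "atomic vectors".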

>
> So it is explicitly just returning the object in non-atomic cases, which
> includes lists. I was not involved in this decision (obviously) but my
> guess is that it is due to the fact that what constitutes an observation
> "being complete" is unclear in the list case. What should
>
> na.omit(list(5, NA, c(NA, 5)))
>
> return? Just the first element, or the first and the last? It seems, at
> least to me, unclear.
>
I agree in principle/theory that it is unclear, but in practice is.na has
an unambiguous answer (if a list element is a scalar NA then it is
considered missing, otherwise not).
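
Applied to the example list above, that rule can be sketched as follows. Note that na_omit_list is a hypothetical helper written for this illustration, not a function in base R or stats, and it is only one possible definition of "complete" for lists.

```r
# Hypothetical helper (not part of R): drop the elements that is.na flags.
na_omit_list <- function(x) {
  stopifnot(is.list(x))
  x[!is.na(x)]
}

ambiguous <- list(5, NA, c(NA, 5))

# Only the length-one NA element is flagged; c(NA, 5) has length 2.
is.na(ambiguous)             # FALSE TRUE FALSE

kept <- na_omit_list(ambiguous)
str(kept)                    # first and last elements are kept
```

Under this rule the answer to the question above would be "the first and the last".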

> A small change to the documentation to add "atomic (in the sense of
> is.atomic returning \code{TRUE})" in front of "vectors", or similar, where
> the supported types of objects are described seems justified, though, imho,
> as the current documentation is either ambiguous or technically incorrect,
> depending on what we take "vector" to mean.
>
> Best,
> ~G
>
> On Wed, Aug 11, 2021 at 10:16 PM Toby Hocking <tdhock5 using gmail.com> wrote:
>
>> Also, the na.omit method for data.frame with a list column seems to be
>> inconsistent with is.na:
>>
>> > L <- list(NULL, NA, 0)
>> > str(f <- data.frame(I(L)))
>> 'data.frame': 3 obs. of  1 variable:
>>  $ L:List of 3
>>   ..$ : NULL
>>   ..$ : logi NA
>>   ..$ : num 0
>>   ..- attr(*, "class")= chr "AsIs"
>> > is.na(f)
>>          L
>> [1,] FALSE
>> [2,]  TRUE
>> [3,] FALSE
>> > na.omit(f)
>>    L
>> 1
>> 2 NA
>> 3  0
>>
>> On Wed, Aug 11, 2021 at 9:58 PM Toby Hocking <tdhock5 using gmail.com> wrote:
>>
>> > na.omit is documented as "na.omit returns the object with incomplete
>> > cases removed." and "At present these will handle vectors," so I
>> > expected that when it is used on a list, it should return the same
>> > thing as if we subset via is.na; however I observed the following,
>> >
>> > > L <- list(NULL, NA, 0)
>> > > str(L[!is.na(L)])
>> > List of 2
>> >  $ : NULL
>> >  $ : num 0
>> > > str(na.omit(L))
>> > List of 3
>> >  $ : NULL
>> >  $ : logi NA
>> >  $ : num 0
>> >
>> > Should na.omit be fixed so that it returns a result that is consistent
>> > with is.na? I assume that is.na is the canonical definition of what
>> > should be considered a missing value in R.
>> >
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
