[Rd] na.omit inconsistent with is.na on list

Gabriel Becker
Thu Aug 12 22:18:58 CEST 2021


Hi Toby,

This definitely appears intentional; the first expression of
stats:::na.omit.default is

    if (!is.atomic(object))
        return(object)

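A quick console check of that early return, as a minimal sketch (the
variable names L and x are only for illustration):

```r
L <- list(NULL, NA, 0)

# A plain list is a vector in the is.vector() sense, but it is not atomic,
# so na.omit.default() hands it back untouched.
is.vector(L)              # TRUE
is.atomic(L)              # FALSE
identical(na.omit(L), L)  # TRUE

# An atomic vector, by contrast, does have its NA dropped
# (as.vector() strips the "na.action" attribute that na.omit attaches).
x <- c(5, NA, 7)
as.vector(na.omit(x))     # 5 7
```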

So it is explicitly just returning the object in non-atomic cases, which
includes lists. I was not involved in this decision (obviously), but my
guess is that it is because what constitutes an observation
"being complete" is unclear in the list case. What should

na.omit(list(5, NA, c(NA, 5)))

return? Just the first element, or the first and the last? It seems, at
least to me, unclear. A small change to the documentation, adding "atomic
(in the sense of is.atomic returning \code{TRUE})" in front of "vectors"
or similar where the supported object types are described, seems justified,
though, imho, as the current documentation is either ambiguous or
technically incorrect, depending on what we take "vector" to mean.
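
To make the ambiguity concrete, here is a rough sketch of the two readings
for that example. Neither is what na.omit does today; the helper name
any_na is hypothetical and only there for illustration.

```r
L <- list(5, NA, c(NA, 5))

# Reading 1: an element is incomplete if ANY value inside it is NA.
# This keeps only the first element.
any_na <- vapply(L, anyNA, logical(1))
str(L[!any_na])

# Reading 2: an element is incomplete only if is.na() flags it, which for a
# list means a length-one atomic element that is NA.
# This keeps the first and the last elements.
str(L[!is.na(L)])
```

Which of the two a list method should implement is exactly the open
question.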

Best,
~G

On Wed, Aug 11, 2021 at 10:16 PM Toby Hocking <tdhock5 using gmail.com> wrote:

> Also, the na.omit method for data.frame with list column seems to be
> inconsistent with is.na,
>
> > L <- list(NULL, NA, 0)
> > str(f <- data.frame(I(L)))
> 'data.frame': 3 obs. of  1 variable:
>  $ L:List of 3
>   ..$ : NULL
>   ..$ : logi NA
>   ..$ : num 0
>   ..- attr(*, "class")= chr "AsIs"
> > is.na(f)
>          L
> [1,] FALSE
> [2,]  TRUE
> [3,] FALSE
> > na.omit(f)
>    L
> 1
> 2 NA
> 3  0
>
> On Wed, Aug 11, 2021 at 9:58 PM Toby Hocking <tdhock5 using gmail.com> wrote:
>
> > na.omit is documented as "na.omit returns the object with incomplete
> > cases removed." and "At present these will handle vectors," so I expected
> > that when it is used on a list, it should return the same thing as if we
> > subset via is.na; however I observed the following,
> >
> > > L <- list(NULL, NA, 0)
> > > str(L[!is.na(L)])
> > List of 2
> >  $ : NULL
> >  $ : num 0
> > > str(na.omit(L))
> > List of 3
> >  $ : NULL
> >  $ : logi NA
> >  $ : num 0
> >
> > Should na.omit be fixed so that it returns a result that is consistent
> > with is.na? I assume that is.na is the canonical definition of what
> > should be considered a missing value in R.
> >
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
