# [Rd] table(exclude = NULL) always includes NA

Martin Maechler maechler at stat.math.ethz.ch
Wed Aug 17 10:28:02 CEST 2016

```
>>>>> Suharto Anggono via R-devel <r-devel at r-project.org>
>>>>>     on Wed, 17 Aug 2016 03:16:52 +0000 writes:

> The quirk as in table(1:3, exclude = 1, useNA = "ifany") is actually somewhat documented, and still in R devel r71104.

yes, the documentation needs updating, too, thank you.

> In R help on 'table', in "Details" section:
> It is best to supply factors rather than rely on coercion.  In particular, ‘exclude’ will be used in coercion to a factor, and so values (not levels) which appear in ‘exclude’ before coercion will be mapped to ‘NA’ rather than be discarded.

> Another part, above it:

> ‘useNA’ controls if the table includes counts of ‘NA’ values: .... Note that levels specified in ‘exclude’ are mapped to ‘NA’ and so included in ‘NA’ counts.

> The last statement is actually not true for an argument that is already a factor.

You are right.  I plan to basically drop both these parts.
So, whereas the code got more complicated, at least the
documentation becomes simpler (because the function behaves
more "logically").

One more thing; I plan to add this paragraph to the 'Examples:'
section :

## "pathological" case:
d.patho <- addNA(c(1,NA,1:2,1:3))[-7]; is.na(d.patho) <- 3:4
d.patho
## just 3 consecutive NA's ? --- well, have *two* kinds of NAs here :
as.integer(d.patho) # 1 4 NA NA 1 2
##
## In R >= 3.4.0, table() allows one to differentiate:
table(d.patho)                   # counts the "unusual" NA
table(d.patho, useNA = "ifany")  # counts all three
table(d.patho, exclude = NULL)   #  (ditto)
table(d.patho, exclude = NA)     # counts none

If you read this and try it in R-devel (svn r >= 71101),

> table(d.patho)                   # counts the "unusual" NA
d.patho
1    2    3 <NA>
2    1    0    1
> table(d.patho, useNA = "ifany")  # counts all three
d.patho
1    2    3 <NA>
2    1    0    3
> table(d.patho, exclude = NULL)   #  (ditto)
d.patho
1    2    3 <NA>
2    1    0    3
> table(d.patho, exclude = NA)     # counts none
d.patho
1 2 3
2 1 0
>

you may find that indeed, one could desire  "more symmetry" :
Namely, we would want a way to only count the two "value-NA"s,
i.e., return the 4th possible result

> table(d.patho, ......)
d.patho
1    2    3 <NA>
2    1    0    2

From a UI point of view, this should probably be achieved by a
fourth 'useNA' option ....
but then, I'm  *not*  jumping to doing that right now
but *will* update the  table  help-page,  soon.

Martin

> --------------------------------------------
> On Tue, 16/8/16, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

> Subject: Re: [Rd] table(exclude = NULL) always includes NA

> Cc: "Martin Maechler" <maechler at stat.math.ethz.ch>
> Date: Tuesday, 16 August, 2016, 5:42 PM

>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Mon, 15 Aug 2016 12:35:41 +0200 writes:

>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Mon, 15 Aug 2016 11:07:43 +0200 writes:

>>>>>     on Sun, 14 Aug 2016 03:42:08 +0000 writes:

>>>> useNA <- if (missing(useNA) && !missing(exclude) && !(NA %in% exclude)) "ifany"
>>>> An example where it changes the 'table' result for non-factor input, from https://stat.ethz.ch/pipermail/r-help/2005-April/069053.html :

>>>> x <- c(1,2,3,3,NA)
>>>> table(as.integer(x), exclude=NaN)

>>>> I bring the example up, in case that the change in result is not intended.

>>> Thanks a lot, Suharto.

>>> To me, the example is convincing that the changes (I committed
>>> Friday), svn rev 71087 & 71088,   are a clear improvement:

>>> (As you surely know, but not all the other readers:)
>>> Before the change, the above example gave *different* results
>>> for  'x'  and  'as.integer(x)', the integer case *not* counting the NAs,
>>> whereas with the change in effect, they are the same:

>>>> x <- as.integer(dx <- c(1,2,3,3,NA))
>>>> table(x, exclude=NaN); table(dx, exclude=NaN)
>>> x
>>> 1    2    3 <NA>
>>> 1    1    2    1
>>> dx
>>> 1    2    3 <NA>
>>> 1    1    2    1
>>>>

>>> --
>>> But the change has affected 6-8 (of the 8000+) CRAN packages
>>> which I am investigating now and probably will be in contact with the
>>> package maintainers after that.

>> There has been another bug in table(), since the time  'useNA'
>> was introduced, which gives (in released R, R-patched, or R-devel):

>>> table(1:3, exclude = 1, useNA = "ifany")

>> 2    3 <NA>
>> 1    1    1
>>>

>> and that bug now (in R-devel, after my changes) triggers in
>> cases it did not previously, notably in

>> table(1:3, exclude = 1)

>> which now does set 'useNA = "ifany"' and so gives the same silly
>> result as the one above.

>> The reason for this bug is that   addNA(..)  is called (in all R
>> versions mentioned) in this case, but it should not.

>> I'm currently testing yet another amendment..

> which was not sufficient... so I had to do *much* more work.

> The result is code which functions -- I hope -- uniformly better
> than the current code, but unfortunately, code that is much longer.

> After all I came to the conclusion that using addNA() was not
> good enough [I did not yet consider *changing* addNA() itself,
> even though the only place we use it in R's own packages is
> inside table()] and so for now have code in table() that itself
> decides whether to add an NA level or not.
> I also have extended the regression tests considerably,
> *and*  example(table)  now reverts to give identical output to
> R 3.3.1 (which it no longer did in R-devel (r 71088)).

> I'm still investigating the CRAN package fallout (from the above
> change 4 days ago) but plan to commit my (unfortunately
> somewhat extensive) changes.

> Also, I think this will become the first in this year's R-devel

> SIGNIFICANT USER-VISIBLE CHANGES:

> • ‘table()’ has been amended to be more internally consistent
> and become backward compatible with R <= 2.7.2 again.
> Consequently, ‘table(1:2, exclude=NULL)’ no longer contains
> a zero count for ‘<NA>’, but ‘useNA = "always"’ continues to
> do so.

> --
> Martin

> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

```
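
For readers who want to see the two kinds of NA in `d.patho` side by side, here is a minimal sketch that counts them separately with base R only, without relying on any particular `table()` behaviour. The names `codes`, `na_level`, `n_value_na`, `n_level_na`, `regular` and `fourth` are made up for illustration; this is not how `table()` does it internally, only one way to reproduce by hand the "4th possible result" mentioned in the message, in which only the two "value-NA"s are counted.

```r
# Rebuild the "pathological" factor from the message above.
d.patho <- addNA(c(1, NA, 1:2, 1:3))[-7]
is.na(d.patho) <- 3:4

# Integer codes of the factor: 1 4 NA NA 1 2
codes <- as.integer(d.patho)

# Position of the NA *level* among the levels (here the 4th level).
na_level <- which(is.na(levels(d.patho)))

# Illustrative names only, not part of base R:
n_value_na <- sum(is.na(codes))                     # elements whose code itself is NA
n_level_na <- sum(codes == na_level, na.rm = TRUE)  # elements pointing at the NA level

# Counts for the ordinary levels, with only the value-NA count appended.
regular <- tabulate(codes[!is.na(codes) & codes != na_level],
                    nbins = na_level - 1L)
fourth <- c(setNames(regular, levels(d.patho)[-na_level]),
            "<NA>" = n_value_na)
fourth
```

Here `n_level_na` is the single "unusual" NA that plain `table(d.patho)` reports in the output shown above, and `n_value_na` is the pair of ordinary missing values; their sum is the 3 reported with `useNA = "ifany"`.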