[R] Tabulation and missing values

Gavin Simpson gavin.simpson at ucl.ac.uk
Wed Oct 4 18:43:57 CEST 2006


On Thu, 2006-10-05 at 05:10 +1300, David Scott wrote:
> I think this is one for Gabor. I don't seem to be able to find my way to 
> an answer despite numerous rereadings of factor and table.
> 
> Here is a toy example:
> 
> 
> ### Some data
> EthnicCode <- c("European/Other", NA, "European/Other", "European/Other",
>                  "Pacific", "European/Other", "European/Other",
>                  "European/Other", "Maori", "Maori", "European/Other",
>                  "European/Other", "Asian", "Pacific")
> ### I don't want the categories in the default order
> ### I also want to be able to include or exclude NA
> ### argument exclude controls inclusion of NA
> table(EthnicCode)
> table(EthnicCode,exclude=NULL)
> ### Creating a factor allows reordering
> EthnicFactor <- factor(EthnicCode, exclude="")
> levels(EthnicFactor) <- list("Maori"="Maori","Pacific"="Pacific",
>                                  "Asian"="Asian",
>                                  "Europ/Other"="European/Other",
>                                      is.na="NA")
                                       ^^^^^^^^^^^

Are you sure you want that last entry? You are creating a level
"is.na" for those entries equal to the character string "NA", which is
not the same as a missing value (NA). When you use

table(EthnicFactor[,drop=TRUE])

you are dropping empty levels, not dropping NAs, which is why your
self-inflicted is.na level disappears. It does not disappear when you
use exclude=NULL, a call I don't follow either, as ?table shows
exclude = c(NA, NaN) as the default, not NULL.
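
To see the difference between the string "NA" and a real NA, here is a
small sketch. The vector x, the factors f and g, and the label "Missing"
are made up for illustration, and it assumes a version of R in which
table() has the useNA argument:

```r
# Toy vector with one genuinely missing value (hypothetical data)
x <- c("Maori", NA, "Asian")

# exclude = NULL keeps NA as a level of the factor
f <- factor(x, exclude = NULL)
levels(f)            # the last level is a true NA, not the string "NA"

is.na(levels(f))     # TRUE only for the missing level
levels(f) == "NA"    # never TRUE: comparing with NA yields NA

# Mapping the string "NA" in a levels<- list therefore matches nothing:
# "Missing" ends up as an empty level and the observation becomes a plain NA
g <- f
levels(g) <- list(Maori = "Maori", Asian = "Asian", Missing = "NA")
table(g, useNA = "ifany")
```

The "Missing" level is empty, so anything that drops unused levels
removes it, while the missing observation itself is still there as NA.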

Just do:

table(factor(EthnicCode))

but that loses the NAs in the factor created (see ?factor and argument
exclude). Compare:

# default: NA is not kept as a level
dat <- factor(EthnicCode)
dat
# exclude = "" preserves NAs as a level
dat2 <- factor(EthnicCode, exclude = "")
dat2

table(dat)  # NAs not counted
table(dat2) # NAs as a level & displayed
table(dat2, exclude = NA) # no NAs
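
If the aim is a fixed category order with the NAs shown only on request,
one possible sketch is below. It is not the only way to do it. The name
ord is made up for illustration, and it assumes a version of R in which
table() has the useNA argument:

```r
EthnicCode <- c("European/Other", NA, "European/Other", "European/Other",
                "Pacific", "European/Other", "European/Other",
                "European/Other", "Maori", "Maori", "European/Other",
                "European/Other", "Asian", "Pacific")

# Desired display order, set through levels= so the NAs are left alone
ord <- c("Maori", "Pacific", "Asian", "European/Other")
EthnicFactor <- factor(EthnicCode, levels = ord)

# Shorten one label without touching the missing values
levels(EthnicFactor)[levels(EthnicFactor) == "European/Other"] <- "Europ/Other"

table(EthnicFactor)                   # NAs omitted
table(EthnicFactor, useNA = "ifany")  # NAs counted in a final <NA> column
```

Here the NAs stay as ordinary missing values in the factor rather than
becoming a level, so whether they appear is decided at the table() call.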

Is that what you were looking for?

HTH

G

> ### I can tabulate with categories in the desired order
> table(EthnicFactor[,drop=TRUE])
> ### But I can't seem to include the missing observations
> table(EthnicFactor,exclude=NULL)
> 
> David Scott

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


