[Rd] factor(x, exclude=y) if x is a factor

Suharto Anggono suharto_anggono at yahoo.com
Thu Dec 6 06:39:36 CET 2012

I found this part in the documentation of 'factor'.

     'factor(x, exclude=NULL)' applied to a factor is a no-operation
     unless there are unused levels: in that case, a factor with the
     reduced level set is returned.  If 'exclude' is used it should
     also be a factor with the same level set as 'x' or a set of codes
     for the levels to be excluded.

Regarding the last sentence, this is the actual behavior.

> x <- factor(c("a","b"), levels=c("a","b"))
> x
[1] a b
Levels: a b
> factor(x, exclude=factor("a", levels=c("a","b")))
[1] a b
Levels: a b
> factor(x, exclude=1L)
[1] a b
Levels: a b

I expected "a" to be removed from the levels in both cases.

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.15.2

The results are the same in R 2.5.1.

In R 2.5.1, if the function 'match' did not apply 'as.character' to a factor argument (and used the factor's internal integer codes instead), setting 'exclude' as described in the documentation quoted above would work. In the example above, "a" would then be removed from the levels.
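
To illustrate that remark, here is a small sketch of matching on integer codes rather than on character strings. It is not the code of 'factor' or of 'match'; the names 'lev_codes', 'excl_codes' and 'kept' are made up for the illustration.

```r
x <- factor(c("a", "b"), levels = c("a", "b"))
exclude <- factor("a", levels = c("a", "b"))

# Integer codes of the levels of x (1 and 2) and of the excluded value (1).
lev_codes  <- seq_along(levels(x))
excl_codes <- as.integer(exclude)

# Matching codes against codes finds level 1, i.e. "a",
# so only "b" is kept.
kept <- levels(x)[is.na(match(lev_codes, excl_codes))]
kept
```

This only works because 'exclude' has the same level set as 'x', which is the condition stated in the documentation.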

One cause of the trouble is this line in the definition of the function 'factor', present in both R 2.15.2 and R 2.5.1.

    exclude <- as.vector(exclude, typeof(x))
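
Here is a step-by-step sketch of what that line does to the example above. It assumes that 'factor' afterwards compares the character levels with the coerced 'exclude' through 'match'; the names 'excl', 'lev', 'm' and 'remaining' are mine and do not appear in the source of 'factor'.

```r
x <- factor(c("a", "b"), levels = c("a", "b"))
exclude <- factor("a", levels = c("a", "b"))

# A factor is stored as integer codes, so typeof(x) is "integer".
typeof(x)

# The factor 'exclude' is therefore reduced to its integer code.
excl <- as.vector(exclude, typeof(x))

# The levels of x are character strings.
lev <- levels(x)

# Comparing character levels with an integer code finds no match,
# so no level is dropped.
m <- match(lev, excl)
remaining <- lev[is.na(m)]
remaining
```

The same happens with 'exclude=1L': the value is already an integer, and it is again compared with the strings "a" and "b".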

What is the actual intent of this line?
