[Rd] duplicated factor labels.

Joris Meys jorismeys at gmail.com
Fri Jun 16 09:35:35 CEST 2017

To extend on Martin's explanation:

In factor(), levels are the unique input values and labels the unique
output values. So the function levels() actually displays the labels.
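
A small sketch of that distinction, using made-up data (the vector x and
the label names below are only for illustration, they are not from the
original question):

```r
# Hypothetical input data, for illustration only.
x <- c("lo", "hi", "lo", "mid")

# 'levels' lists the input values to look for in x (and fixes their order);
# 'labels' gives the names those values carry in the resulting factor.
f <- factor(x,
            levels = c("lo", "mid", "hi"),
            labels = c("Low", "Medium", "High"))

levels(f)      # returns the labels, not the original input values
as.integer(f)  # integer codes that index into levels(f)
```

After construction the original input values are gone from the object;
only the integer codes and the labels are stored.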


On 15 Jun 2017 17:15, "Martin Maechler" <maechler at stat.math.ethz.ch> wrote:

>>>>> Paul Johnson <pauljohn32 at gmail.com>
>>>>>     on Wed, 14 Jun 2017 19:00:11 -0500 writes:

    > Dear R devel
    > I've been wondering about this for a while. I am sorry to ask for your
    > time, but can one of you help me understand this?

    > This concerns duplicated labels, not levels, in the factor function.

    > I find it hard to understand why factor() fails, while assigning
    > levels() afterwards does not:

    >> x <- 1:6
    >> xlevels <- 1:6
    >> xlabels <- c(1, NA, NA, 4, 4, 4)
    >> y <- factor(x, levels = xlevels, labels = xlabels)
    > Error in `levels<-`(`*tmp*`, value = if (nl == nL)
    > as.character(labels) else paste0(labels,  :
    > factor level [3] is duplicated
    >> y <- factor(x, levels = xlevels)
    >> levels(y) <- xlabels
    >> y
    > [1] 1    <NA> <NA> 4    4    4
    > Levels: 1 4

    > If the latter use of levels() gives a good, expected result, couldn't
    > factor(..., labels = xlabels) be made to do the same thing?

I may misunderstand, but I think you are confusing 'labels' and 'levels'
here (and you are not alone in this!), mostly because R's
factor() function treats them as arguments in a way that can be
confusing (but I don't think we'd want to change that; it's
been documented and in use for more than 25 years in S, S+, and R).

Note that after the above,

> dput(y)
structure(c(1L, NA, NA, 2L, 2L, 2L), .Label = c("1", "4"), class = "factor")

and that of course _is_ a valid factor .. which you can easily
get directly via e.g.

> identical(y, factor(c(1,NA,NA,4,4,4)))
[1] TRUE

or also  via

> identical(y, factor(c("1",NA,NA,"4","4","4")))
[1] TRUE
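
For anyone who needs the merged result repeatedly, the two-step route from
the original question could be wrapped as below. This is only a sketch:
collapse_labels() is a hypothetical helper name, not a function in base R,
and it relies solely on factor() and `levels<-` as used above.

```r
# collapse_labels() is a hypothetical helper, not part of base R.
collapse_labels <- function(x, levels, labels) {
  f <- factor(x, levels = levels)  # one level per distinct input value
  levels(f) <- labels              # equal labels merge; NA labels map to NA
  f
}

y <- collapse_labels(1:6, levels = 1:6, labels = c(1, NA, NA, 4, 4, 4))
levels(y)
identical(y, factor(c(1, NA, NA, 4, 4, 4)))
```

This keeps factor() itself unchanged and makes the merging step explicit
at the call site.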

I really don't see a need for a change of factor().
It should remain as simple as possible (but not simpler :-).


R-devel at r-project.org mailing list
