[R] levels of factor

Marc Schwartz MSchwartz at MedAnalytics.com
Tue Aug 17 17:19:40 CEST 2004

On Tue, 2004-08-17 at 09:30, Luis Rideau Cruz wrote:
> R-help,
> I have a data frame which I subset like :
> a <- subset(df,df$"column2" %in% c("factor1","factor2")  & df$"column2"==1)
> But when I type levels(a$"column2") I still get the same levels as in df (my original data frame)
> Why is that?

The default for [.factor is:

x[i, drop = FALSE]

Hence, unused factor levels are retained.
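As a small illustration of that default, here is a sketch using a made-up factor (the values "factor1" to "factor3" are only placeholders, not your data). It also shows that passing drop = TRUE at subsetting time discards the unused levels:

```r
# Hypothetical data: a factor with three levels
f <- factor(c("factor1", "factor2", "factor3", "factor1"))

# Keep only the elements equal to "factor1" or "factor2"
g <- f[f %in% c("factor1", "factor2")]

levels(g)   # still lists "factor3", although no element uses it
table(g)    # the unused level appears with a count of 0

# drop = TRUE discards the unused levels when subsetting
h <- f[f %in% c("factor1", "factor2"), drop = TRUE]
levels(h)   # "factor1" "factor2"
```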

> Is it right?


If you want to explicitly recode the factor based upon only those levels
that are actually in use, you can do something like the following:

a$column2 <- factor(a$column2)
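Since 'a' is a data frame, the recoding has to be applied to the factor column rather than to the data frame as a whole. A minimal sketch, assuming a made-up data frame in place of your 'df' (the column names and values are placeholders):

```r
# Hypothetical data frame standing in for the poster's 'df'
df <- data.frame(column1 = 1:4,
                 column2 = factor(c("factor1", "factor2", "factor3", "factor1")))

a <- subset(df, column2 %in% c("factor1", "factor2"))
levels(a$column2)   # all three levels are still present

# Re-create the factor column from the values that remain
a$column2 <- factor(a$column2)
levels(a$column2)   # only the levels actually in use
```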

However, I am a bit unclear as to the logic of the subset statement that
you are using, perhaps because I don't know what your data look like.

You seem to be subsetting 'column2' on both the factor levels and a
presumed numeric code. Is that really what you want to do?
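To show why the combined condition looks suspicious, here is a sketch that assumes 'column2' is a factor (the values are again placeholders). Comparing a factor with a number tests the level labels, not the internal integer codes, so the second half of the condition never matches unless a level is literally labelled "1":

```r
# Hypothetical factor standing in for df$column2
f <- factor(c("factor1", "factor2", "factor3"))

f == 1               # FALSE FALSE FALSE: compared with the labels, not the codes
as.integer(f)        # 1 2 3: the internal integer codes
as.integer(f) == 1   # TRUE FALSE FALSE

# The combined condition therefore selects nothing
f %in% c("factor1", "factor2") & f == 1
```

If the internal codes really are what you want to test, as.integer() makes that explicit, but selecting on the labels is usually the safer choice.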

You might want to review the "Warning" section in ?factor

BTW, when using subset(), the evaluation takes place within the data
frame, so you do not need to use df$"column2" in the function call. You
can just use column2, for example:

subset(df, column2 %in% c("factor1", "factor2"))
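A quick check, on a made-up data frame (names and values are placeholders), that the two ways of writing the condition select the same rows:

```r
# Hypothetical data frame standing in for the poster's 'df'
df <- data.frame(column1 = 1:4,
                 column2 = factor(c("factor1", "factor2", "factor3", "factor1")))

# Explicit df$"column2" versus the bare column name
a1 <- subset(df, df$"column2" %in% c("factor1", "factor2"))
a2 <- subset(df, column2 %in% c("factor1", "factor2"))

identical(a1, a2)   # TRUE
```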

See ?factor and ?"[.factor" for more information.


Marc Schwartz
