[R] dropping factor levels in subset

Marc Schwartz mschwartz at medanalytics.com
Fri Jun 27 05:35:43 CEST 2003


>-----Original Message-----
>From: r-help-bounces at stat.math.ethz.ch 
>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Nick Bond
>Sent: Thursday, June 26, 2003 10:08 PM
>To: r-help at stat.math.ethz.ch
>Subject: [R] dropping factor levels in subset
>
>
>Dear all,
>I've taken a subset of data from a data frame using
>
>crb <- subset(all.raw, creek %in% c("CR") & year %in% c(2000, 2001) &
>              substrate %in% ("b"))
>
>this works fine, except that all of the original factor levels are
>maintained. This results in NAs for these empty levels when I try to do
>summaries based on factors using by(). Is there a simple way to drop the
>factor levels that are no longer represented? I've used na.omit on the
>results from by(), but then I have to deal with the attr setting, which
>catches me too. Probably a silly question, but I've done a search and
>couldn't find anything. Can someone help me, please?
>Regards
>Nick

Calling factor() on an existing factor returns that factor with its
unused levels dropped. See ?factor for additional information. A quick
example:

# Create a factor
> old.factor <- factor(c("One", "Two", "Three", "Four"))
> old.factor
[1] One   Two   Three Four 
Levels: Four One Three Two

# Take a subset of the first three elements; note that
# all four levels are retained
> new.factor <- old.factor[1:3]
> new.factor
[1] One   Two   Three
Levels: Four One Three Two

# Drop unused level
> new.factor2 <- factor(new.factor)
> new.factor2
[1] One   Two   Three
Levels: One Three Two
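
The same idea can be applied to every factor column of a subsetted data
frame. What follows is only a sketch: the contents of all.raw here are
made-up stand-in data (the column names creek, year and substrate come
from the original question; the count column and all the values are
assumptions for illustration), and is.fac is just a helper name.

```r
# Hypothetical stand-in for the poster's all.raw data frame
all.raw <- data.frame(
  creek     = factor(c("CR", "CR", "LB", "CR")),
  year      = c(2000, 2001, 2000, 1999),
  substrate = factor(c("b", "b", "s", "b")),
  count     = c(3, 5, 2, 7)
)

# Subset as in the original question; unused levels are still attached
crb <- subset(all.raw, creek %in% c("CR") & year %in% c(2000, 2001) &
              substrate %in% ("b"))

# Find the factor columns and rebuild each one with factor(),
# which keeps only the levels that actually occur
is.fac <- sapply(crb, is.factor)
crb[is.fac] <- lapply(crb[is.fac], factor)

levels(crb$creek)      # "CR"
levels(crb$substrate)  # "b"
```

After this, summaries with by() on crb should no longer show entries for
the empty levels. More recent versions of R also provide droplevels(),
which can be called directly on a factor or a data frame to the same
effect.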


HTH,

Marc Schwartz



