[R] drop unused levels in subset.data.frame

David Winsemius dwinsemius at comcast.net
Tue Nov 10 17:09:02 CET 2009


On Nov 10, 2009, at 10:49 AM, baptiste auguie wrote:

> Dear list,
>
> subset has a 'drop' argument that I had often mistaken for the one in
> [.factor which removes unused levels.
> Clearly it doesn't work that way, as shown below,
>
> d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
> s <- subset(d, y=="A", drop=TRUE)
> str(s)
> 'data.frame':	5 obs. of  2 variables:
> $ x: Factor w/ 15 levels "a","b","c","d",..: 1 4 7 10 13
> $ y: Factor w/ 3 levels "A","B","C": 1 1 1 1 1
>
> The subset still retains all the unused factor levels. I wonder how
> people usually get rid of all unused levels in a data.frame after
> subsetting? I came up with this but I may have missed a better
> built-in solution,
>
> dropit <- function (d, columns = names(d), ...)
> {
>    d[columns] = lapply(d[columns], "[", drop=TRUE, ...)
>    d
> }
>
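For what it is worth, ?subset describes 'drop' as being passed on to the [ indexing operator of the data frame. It therefore decides whether a one-column result collapses to a vector, not whether factor levels are kept. A minimal sketch of the difference, where the object names f, v and s are only for illustration:

```r
d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))

# drop=TRUE in [.factor discards levels that are no longer used
f <- d$x[1:3, drop = TRUE]
nlevels(f)

# drop=TRUE in subset() only affects dimensions: with a single selected
# column the result is a plain factor rather than a data frame
v <- subset(d, y == "A", select = x, drop = TRUE)
is.data.frame(v)

# with several columns the result is still a data frame, levels untouched
s <- subset(d, y == "A", drop = TRUE)
nlevels(s$x)
```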

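If you are looking for a one-liner, consider re-applying factor() to each
factor column; calling factor() on a factor keeps only the levels still in use: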

data.frame(lapply(s, function(x) if (is.factor(x)) factor(x) else x))

I added a numeric column to make sure I had not clobbered a non-factor  
variable.

 > d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]), N=1:15)
 > s <- subset(d, y=="A", drop=TRUE)
 > str( data.frame(lapply(s, function(x) if (is.factor(x)){ factor(x)} else {x})) )
'data.frame':	5 obs. of  3 variables:
  $ x: Factor w/ 5 levels "a","d","g","j",..: 1 2 3 4 5
  $ y: Factor w/ 1 level "A": 1 1 1 1 1
  $ N: int  1 4 7 10 13
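If this is needed more than once, the one-liner could be wrapped in a small helper. The sketch below is only one way to do it, and the name drop_unused_levels is made up for illustration (it is not a base R function). It assigns into df[] rather than calling data.frame(), on the assumption that you want to keep the row names and column names left by subset().

```r
# Hypothetical helper, not part of base R or of the original post.
drop_unused_levels <- function(df) {
  # factor() applied to a factor rebuilds it from the values present,
  # so unused levels disappear; non-factor columns pass through unchanged.
  # Assigning into df[] keeps the data frame's row names and attributes.
  df[] <- lapply(df, function(col) if (is.factor(col)) factor(col) else col)
  df
}

d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]), N = 1:15)
s <- subset(d, y == "A")
s2 <- drop_unused_levels(s)
str(s2)
```

The rebuilt data frame from data.frame(lapply(...)) gets fresh row names 1..n, whereas the df[] assignment keeps the original ones (1, 4, 7, ...); which is preferable depends on what you do next.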


> str(dropit(s))
> 'data.frame':	5 obs. of  2 variables:
> $ x: Factor w/ 5 levels "a","d","g","j",..: 1 2 3 4 5
> $ y: Factor w/ 1 level "A": 1 1 1 1 1
>
>
> Best regards,
>
> baptiste

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



