[R] assign factor levels based on list

David Winsemius dwinsemius at comcast.net
Wed Feb 9 22:32:26 CET 2011


On Feb 9, 2011, at 4:18 PM, David Winsemius wrote:

>
> On Feb 9, 2011, at 3:44 PM, Tim Howard wrote:
>
>> All,
>>
>> Given a data frame and a list containing factor definitions for  
>> certain columns, how can I apply those definitions from the list,  
>> rather than doing it the standard way, as noted below. I'm lost in  
>> the world of do.call, assign, paste, and can't find my way through.  
>> For example:
>>
>> #set up df
>> y <- data.frame(colOne = c(1,2,3), colTwo =  
>> c("apple","pear","orange"))
>>
>> factor.defs <- list(colOne = list(name = "colOne",
>> lvl = c(1,2,3,4,5,6)),
>> colTwo = list(name = "colTwo",
>> lvl = c("apple","pear","orange","fig","banana")))
>>
>> #A standard way to define levels
>> y$colTwo <- factor(y$colTwo , levels =  
>> c("apple","pear","orange","fig","banana"))
>
> Here's a one item way of using factor.defs. I thought it would be  
> pretty easy to loop through it with lapply or do.call, but it's not  
> immediately obvious once I get down to the nitty gritty.
>
> > y[factor.defs[[1]]$name] <- factor(y[[factor.defs[[1]]$name]] ,  
> levels= factor.defs[[1]]$lvl)
> > y
>  colOne colTwo
> 1      1  apple
> 2      2   pear
> 3      3 orange
>
> levels(y$colOne)
> #[1] "1" "2" "3" "4" "5" "6"
>
> Note the different uses of "[" and "[[" on each side of the  
> assignment.
>
> This works on your example,  but I don't think it would leave the  
> non-targeted columns in place
>
> y <- as.data.frame( lapply(factor.defs, function(x) { y[[x$name]] <-  
> factor(y[[x$name]] , levels= x$lvl) } ) )
> y
>  colOne colTwo
> 1      1  apple
> 2      2   pear
> 3      3 orange
>
> I wonder if I could leave out the as.data.frame part and make an  
> assignment in the parent.frame instead?
>
>  y <- within(y, lapply(factor.defs, function(x) { y[[x$name]] <-  
> factor(y[[x$name]] , levels= x$lvl) } ) )
> y
>  colOne colTwo
> 1      1  apple
> 2      2   pear
> 3      3 orange
>
> Looks promising. You should construct a more complex test set and  
> report back.

The within() attempt didn't succeed (no factor levels were modified), but this seems to work:

y <- data.frame(colOne = c(1,2,3), colTwo = c("apple","pear","orange"),
                colThree=c(4,5,6) )

factor.defs <- list(colOne = list(name = "colOne",
                                   lvl = c(1,2,3,4,5,6)),
                     colTwo = list(name = "colTwo",
                          lvl = c("apple","pear","orange","fig","banana")))

y[ , names(factor.defs)] <- lapply(factor.defs, function(x) {
                       y[[x$name]] <- factor(y[[x$name]] , levels = x$lvl) } )
y
   colOne colTwo colThree
1      1  apple        4
2      2   pear        5
3      3 orange        6
 > str(y)
'data.frame':	3 obs. of  3 variables:
  $ colOne  : Factor w/ 6 levels "1","2","3","4",..: 1 2 3
  $ colTwo  : Factor w/ 5 levels "apple","pear",..: 1 2 3
  $ colThree: num  4 5 6
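
Since only the value returned by the anonymous function is used, the assignment to y inside it is not strictly needed. A minimal sketch of the same idea wrapped in a helper, so that the definitions (but not the data) can be passed around as asked. The name apply_factor_defs is hypothetical, not an existing R function, and the sketch assumes every definition has the name and lvl elements used in factor.defs above:

```r
# Hypothetical helper: apply_factor_defs() is an illustrative name, not part of R.
# It assumes each element of defs is a list with elements 'name' and 'lvl'.
apply_factor_defs <- function(df, defs) {
  for (d in defs) {
    # Only touch columns that actually exist in df; others are left alone
    if (d$name %in% names(df)) {
      df[[d$name]] <- factor(df[[d$name]], levels = d$lvl)
    }
  }
  df
}

y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(4, 5, 6))

factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

y <- apply_factor_defs(y, factor.defs)
str(y)
```

A plain for loop is used here because each step modifies df in place; columns not named in defs (colThree in this example) pass through unchanged.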


> -- 
> David.
>
>>
>> # I'd like to use the definitions locally but also pass them (but  
>> not the data) to a function,
>> # so, rather than defining each manually each time, I'd like to  
>> loop through the columns,
>> # call them by name, find the definitions in the list and use them  
>> from there. Before I try to loop
>> # or use some form of apply, I'd like to get a single factor  
>> definition working.
>>
>> # this doesn't seem to see the dataframe properly
>> do.call(factor,list((paste("y$",factor.defs[2][[1]]$name,sep="")),
>>   levels=factor.defs[2][[1]]$lvl))
>>
>> #adding "as.name" doesn't help
>> do.call(factor,list(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
>>   levels=factor.defs[2][[1]]$lvl))
>>
>> #Here's my attempt to mimic the standard way, using assign. Ha! what a joke.
>> assign(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
>>   do.call(factor, list(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
>>   levels = factor.defs[2][[1]]$lvl)))
>> ##Error in function (x = character(), levels, labels = levels,  
>> exclude = NA,  :
>> ##  object 'y$colTwo' not found
>> Any help or perspective (or better way from the beginning!) would  
>> be greatly appreciated.
>> Thanks in advance!
>> Tim
>>
>>
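
On why the quoted do.call attempts fail: paste() only builds the character string "y$colTwo", so factor() is handed that one string rather than the column, and as.name() turns the whole string into a single symbol, which R then looks up as a variable literally named y$colTwo (hence "object 'y$colTwo' not found"). A minimal sketch of a single-column version that indexes the data frame with "[[" instead, using the data from the original question; the variable names def, bad and good are only for illustration:

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))

factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

def <- factor.defs[[2]]   # same element as factor.defs[2][[1]]

# Passing the pasted string factors the string itself, not the column:
# the result has length 1 and is NA, because "y$colTwo" is not a level
bad <- do.call(factor, list(paste("y$", def$name, sep = ""), levels = def$lvl))

# Indexing by name with [[ hands the actual column to factor()
good <- do.call(factor, list(y[[def$name]], levels = def$lvl))

# Assign back with [[<- rather than assign()
y[[def$name]] <- good
levels(y$colTwo)
```

With the column extracted this way do.call is not really required; factor(y[[def$name]], levels = def$lvl) does the same thing directly.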

David Winsemius, MD
West Hartford, CT


