[R] Sharing levels across multiple factor vectors

hadley wickham h.wickham at gmail.com
Thu Apr 1 13:26:09 CEST 2010


On Thu, Apr 1, 2010 at 3:05 AM, Peter Dalgaard <pdalgd at gmail.com> wrote:
> Jeff Brown wrote:
>> Sorry for spamming.  I swear I had worked on that problem a long time before
>> posting.
>>
>> But I just figured it out: I have to change the values, which are
>> represented as integers, not strings.  So the following code will do it:
>>
>> df <- data.frame (
>>       a = factor( c( "bob", "alice", "bob" ) ),
>>       b = factor( c( "kenny", "alice", "alice" ) )
>> );
>> allLevels <- unique( c( levels( df$a ), levels( df$b ) ) )
>> for (c in colnames(df)) {
>>       df[,c] <- match( df[,c], allLevels);
>>       levels( df[,c] ) <- 1:(length(allLevels))
>> };
>>
>
> Hmm, I think I'd go for something like
>
> allLevels <- unique(unlist(lapply(df,levels)))
> df[] <- lapply(df, factor,
> levels=allLevels, labels=seq_along(allLevels))

This behaviour always catches me out:

levels(f) <- l
is very different to
f <- factor(f, levels = l)
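
A minimal sketch of the difference, using column b from the example
above. The names recoded and renamed are illustrative and do not appear
in the thread. As far as I can tell, factor() keeps each value and
re-codes it against the new level set, while levels<- keeps the integer
codes and renames the levels by position, so values can change silently:

```r
# Column b from the quoted example; its levels sort to "alice", "kenny"
b <- factor(c("kenny", "alice", "alice"))
allLevels <- c("alice", "bob", "kenny")

# factor(): values are preserved, only the set of levels is extended
recoded <- factor(b, levels = allLevels)
as.character(recoded)   # "kenny" "alice" "alice"
as.integer(recoded)     # 3 1 1

# levels<-: integer codes (2 1 1) are preserved, levels are renamed
# by position, so the second old level "kenny" is now called "bob"
renamed <- b
levels(renamed) <- allLevels
as.character(renamed)   # "bob" "alice" "alice"
as.integer(renamed)     # 2 1 1
```

So for sharing one level set across several columns, the factor(f,
levels = l) form is the one that leaves the data intact.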

Hadley


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


