[R] Mapping factors to a new set of factors

james.arnold at sssc.uk.com james.arnold at sssc.uk.com
Fri Sep 11 14:48:19 CEST 2009


Thanks a lot Phil,

Recode is exactly what I was looking for. I managed to get my old function working using sapply, but the performance was horrendously slow!
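
For what it is worth, sapply tends to be slow here because it calls the function once per element, while the mapping depends only on the level. A minimal base-R sketch of that idea is below: it relabels the levels once and lets R carry the change through to every element. The helper name collapse_levels, the groups list and the fallback label are made up for illustration; they are not part of car or base R.

```r
# Hypothetical helper (not from car or base R): collapse factor levels
# into broader groups, addressed by level index.
collapse_levels <- function(f, groups, other = "Not In Original Set") {
  f <- as.factor(f)
  lvls <- levels(f)                       # local variable; no global state needed
  idx <- unlist(groups)
  # Every index must point at an existing level and be used only once
  stopifnot(all(idx >= 1), all(idx <= length(lvls)), !anyDuplicated(idx))
  new_lvls <- rep(other, length(lvls))    # levels in no group get the fallback label
  for (g in names(groups)) {
    new_lvls[groups[[g]]] <- g
  }
  levels(f) <- new_lvls                   # repeated labels merge the old levels
  f
}

# Toy data standing in for df$Var1
x <- factor(c("a", "b", "c", "d", "e", "a"))
region <- collapse_levels(x, list(North = c(1, 2), East = 3, South = 4))
```

The result is a factor; wrapping it in as.character() gives a character vector if that is preferred.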

One other thing: the lvls vector only seems to work when it is defined in the global environment. If lvls is a local variable inside a function, and that same function then calls recode, recode does not seem to be able to see it.
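
A likely explanation (an assumption, not verified against the car source here) is that recode evaluates the specification string in its own environment, so it can find global objects but not the caller's local variables. One possible workaround is to paste the level values into the string before calling recode, so the string no longer refers to lvls at all. In the sketch below, make_spec and its arguments are hypothetical names, and it assumes no level contains a double quote.

```r
# Hypothetical helper: build a recode() specification with the level
# values written out literally, so no variable lookup is needed later.
make_spec <- function(lvls, groups, other = "Not In Original Set") {
  # Turn c("a", "b") into the text c("a","b")
  quote_vals <- function(v) {
    paste0("c(", paste0('"', v, '"', collapse = ","), ")")
  }
  # One 'values="label"' clause per group
  parts <- vapply(names(groups), function(g) {
    paste0(quote_vals(lvls[groups[[g]]]), '="', g, '"')
  }, character(1))
  # Join the clauses and append the else clause
  paste(c(parts, paste0('else="', other, '"')), collapse = "; ")
}

spec <- make_spec(c("a", "b", "c"), list(North = c(1, 2), East = 3))

# Inside a function one could then call, assuming car is installed:
#   car::recode(x, make_spec(levels(x), groups))
```

Because the string is built from a local lvls before recode is called, it should not matter where lvls lives.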

Thanks,
James

-----Original Message-----
From: Phil Spector [mailto:spector at stat.berkeley.edu] 
Sent: 08 September 2009 22:25
To: Arnold, James
Subject: Re: [R] Mapping factors to a new set of factors

James -
    If you need to do something like this, I strongly recommend
the recode function of the car package.  You can use it like this:

library(car)
# Assumption: lvls holds the levels of x and is defined at the top level
lvls <- levels(x)
recode(x,'lvls[c(1,2,13,17,20,23,27)]="North";
           lvls[c(3,5,7,14,15,24,30)]="East";
           lvls[c(4,6,8,9,11,16,18,21,22,25,28,29,31)]="West";
           lvls[c(10,12,19,26,32)]="South";
           else="Not In Original Set"')

Including the as.factor=FALSE argument to recode will return 
a character vector -- by default it returns a factor.

                                         - Phil Spector
                                          Statistical Computing Facility
                                          Department of Statistics
                                          UC Berkeley
                                          spector at stat.berkeley.edu



On Tue, 8 Sep 2009, james.arnold at sssc.uk.com wrote:

> Hello,
>
> I am trying to map a factor variable within a data frame to a new variable whose entries are derived from the original values, with fewer levels in the new variable than in the original. That is, I'm trying to set up a surjection.
>
> I expected this to be a common operation with a simple interface, but I cannot find one, nor any similar posts on this topic (please correct me if there is something).
>
> Therefore, I have written a function to perform this mapping. However, the function I have written doesn't seem to work with vectors greater than length 1, and as such is useless. Is there any way to ensure the function would work appropriately for each element of the vector input?
>
> mapLN <- function(x)
> {
> 	Reg <- levels(df$Var1)
> 	if (x==Reg[1] | x==Reg[2] | x==Reg[13] | x==Reg[17] | x==Reg[20] | x==Reg[23] | x==Reg[27]) {"North"} else
> 	if (x==Reg[3] | x==Reg[5] | x==Reg[7] | x==Reg[14] | x==Reg[15] | x==Reg[24] | x==Reg[30]) {"East"} else
> 	if (x==Reg[4] | x==Reg[6] | x==Reg[8] | x==Reg[9] | x==Reg[11] | x==Reg[16] | x==Reg[18] | x==Reg[21] | x==Reg[22] | x==Reg[25] | x==Reg[28] | x==Reg[29] | x==Reg[31]) {"West"} else
> 	if (x==Reg[10] | x==Reg[12] | x==Reg[19] | x==Reg[26] | x==Reg[32]) {"South"} else
> 	stop("Not in original set")
> }
>
> Many thanks,
> James
>
> This E-Mail is confidential and intended solely for the use of the individual to whom it is addressed. If you are not the addressee, any disclosure, reproduction, copying, distribution or other dissemination or use of this communication is strictly prohibited. If you have received this transmission in error please notify the sender immediately by replying to this e-mail, or telephone 01382 207 222, and then delete this e-mail.
>
> All outgoing messages are checked for viruses; however, no guarantee is given that this e-mail message, and any attachments, are free from viruses.  You are strongly recommended to check for viruses using your own virus scanner.  Neither SCRC nor SSSC will accept responsibility for any damage caused as a result of virus infection.
>
>
