[R] Generating new variable based on values of an existing variable

Marc Schwartz marc_schwartz at comcast.net
Mon Feb 9 21:07:04 CET 2009


on 02/09/2009 01:30 PM Josip Dasovic wrote:
> Dear R Help-Listers:
> 
> I have a problem that seems like it should have a simple solution, but I've spent hours on it (and searching the r-help archives) to no avail. What I'd like to do is to generate a new variable within a data frame, the values of which are dependent upon the values of an existing variable within that data frame. 
> 
> Assume that I have the following data:
> 
> mydf<-data.frame(region=c(rep("North", 5), rep("East", 5), rep("South", 5), rep("West", 5)))
> 
> Assume, in addition, that I have a factor vector with four values (I actually have a factor with almost two-hundred values):
> 
> element<-c("earth", "water", "air", "fire")
> 
> I would like to add a new variable to the data frame (called "element") such that the value of "element" is "earth" in each observation for which mydf$region=="North", etc. In STATA, this was relatively easy; is there a simple way to do this in R? 
> 
> This is what the final result should look like:
> 
>> mydf
>    region element
> 1   North   earth
> 2   North   earth
> 3   North   earth
> 4   North   earth
> 5   North   earth
> 6    East   water
> 7    East   water
> 8    East   water
> 9    East   water
> 10   East   water
> 11  South     air
> 12  South     air
> 13  South     air
> 14  South     air
> 15  South     air
> 16   West    fire
> 17   West    fire
> 18   West    fire
> 19   West    fire
> 20   West    fire
> 
> Thanks in advance,
> Josip

I am going to presume that, unlike your example data above, the real data
may not be arranged in contiguous runs of each region. Thus, a more general
approach would be to set mydf$region as a factor, with the factor levels
ordered to match 1:1 the sequence in 'element':

  mydf$region <- factor(mydf$region,
                        levels = c("North", "East", "South", "West"))

  element <- c("earth", "water", "air", "fire")

  # Set mydf$element to the value in 'element' which corresponds to the
  # underlying factor integer code for mydf$region

  mydf$element <- element[as.integer(mydf$region)]

> mydf
   region element
1   North   earth
2   North   earth
3   North   earth
4   North   earth
5   North   earth
6    East   water
7    East   water
8    East   water
9    East   water
10   East   water
11  South     air
12  South     air
13  South     air
14  South     air
15  South     air
16   West    fire
17   West    fire
18   West    fire
19   West    fire
20   West    fire
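
With almost two hundred values, keeping the factor levels and 'element' in
matching order by hand could get error-prone. A possible alternative, sketched
here under the assumption that each region maps to exactly one element, is a
named character vector used as a lookup table. The name 'lookup' is
hypothetical and not part of the original post; the data frame is rebuilt so
the snippet runs on its own.

```r
# Example data from the original post
mydf <- data.frame(region = rep(c("North", "East", "South", "West"), each = 5))

# Hypothetical lookup table: names are regions, values are elements
lookup <- c(North = "earth", East = "water", South = "air", West = "fire")

# Index the lookup vector by name. as.character() guards against 'region'
# being a factor, in which case indexing would use the integer codes
# rather than the labels. unname() drops the names carried over from 'lookup'.
mydf$element <- unname(lookup[as.character(mydf$region)])
```

Any region that has no entry in 'lookup' ends up with NA in mydf$element, so
gaps in the mapping are easy to spot with is.na().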


HTH,

Marc Schwartz