[R] use 'lapply' to create 2 new columns based on old ones in a data frame

Sundar Dorai-Raj sundar.dorai-raj at pdf.com
Sat Oct 13 02:23:46 CEST 2007



runner said the following on 10/12/2007 4:46 PM:
> There is a dataset 'm', which has 3 columns: 'index', 'old1' and 'old2';
> 
> I want to create 2 new columns: 'new1' and 'new2' on this condition: 
> if 'index'==i, then 'new1'='old1'+add[i] and 'new2'='old2'+add[i].
> 'add' is a vector of numbers to be added to old columns, e.g. add=c(10,20,30 ...)
>  
> Like this:
> 
> index  old1  old2  new1  new2
>     1     5     6    15    16
>     2     5     6    25    26
>     3     5     6    35    36
>     3    50    60    80    90
> 
> Since the actual dataset is huge, I use 'lapply'. I am able to add 1 column:
> 
> do.call(rbind, lapply( 1:nrow(m), 
>                             function(i) {m$new1[i]=m[i,2]+add[m[i,1]];
> return (m[i,])} 
>                            ))
> 
> but I don't know how to do it for 2 columns at the same time; something
> like this simply doesn't work:
> do.call(rbind,lapply(1:nrow(m), 
>                           function(i){ m$new1[i]=m[i,2]+add[m[i,1]]; 
>                                           m$new2[i]=m[i,3]+add[m[i,1]]; 
>                                           return (m[i,])}
>                          ))
> Could you please tell me how, or suggest a better approach?
> 
> 


No need for lapply.

m$new1 <- m$old1 + add[m$index]
m$new2 <- m$old2 + add[m$index]
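
As a minimal sketch, here is the same idea applied to the four example rows from the question. The data frame is rebuilt by hand from the table above and is assumed to be named 'm' as in the question. Both columns are filled in one assignment, which relies on R recycling the offset vector down each column of the data frame:

```r
# Example data reconstructed from the table in the question
m <- data.frame(index = c(1, 2, 3, 3),
                old1  = c(5, 5, 5, 50),
                old2  = c(6, 6, 6, 60))
add <- c(10, 20, 30)

# add[m$index] yields one offset per row (10, 20, 30, 30).
# Adding it to a two-column data frame applies it to each column in turn.
m[c("new1", "new2")] <- m[c("old1", "old2")] + add[m$index]

m
```

The two separate assignments do the same thing and may be easier to read; the one-step form is only a convenience when there are many columns.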

To see how this works, try:

add <- c(10, 20, 30)
index <- c(1, 2, 1, 3, 1, 2, 3)
add[index]

But be careful: if 'add' has length 3 and 'index' contains a 4, you will get an NA:

index <- c(index, 4)
add[index] ## produces an 'NA'
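
One way to catch this before the NAs end up in the new columns is to check the index values against the length of 'add'. This is only a sketch; the names 'bad' and 'offsets' are made up for illustration:

```r
add <- c(10, 20, 30)
index <- c(1, 2, 1, 3, 1, 2, 3, 4)

# Positions in 'index' that do not point at an existing element of 'add'
bad <- which(!(index %in% seq_along(add)))
bad

# Alternatively, index first and then test the result for missing values
offsets <- add[index]
anyNA(offsets)
```

If 'bad' is non-empty (or anyNA() returns TRUE), either 'add' is too short or 'index' holds unexpected values.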

I hope I understood your question correctly. It's happy hour on the U.S. 
east coast.

HTH,

--sundar


