[R] Difference between comma separated values in column

arun smartpink111 at yahoo.com
Tue Apr 29 11:24:55 CEST 2014



Hi,

I guess this should be a bit faster than the mapply() version in my earlier message (quoted below).
#1st case (list columns)
dat1$V3 <- lapply(seq_along(dat1$V2), function(i)
  c(dat1$V2[[i]][-1] - head(dat1$V1[[i]], -1), tail(dat1$V1[[i]], 1)))
#2nd case (character columns; lst1 as created in the quoted message below)
dat2$V3 <- unlist(lapply(seq_along(lst1[,2]), function(i)
  paste(c(lst1[,2][[i]][-1] - head(lst1[,1][[i]], -1), tail(lst1[,1][[i]], 1)), collapse=",")))
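
The same per-row step can also be pulled out into a small function, which may be easier to read and check. This is only a sketch: the names shifted_diff and to_num are made up for illustration, the data are the example from this thread, and it assumes both columns hold the same number of values in each row.

```r
# Hypothetical helper: for one row, subtract the n-th value of v1 from the
# (n+1)-th value of v2, then append the last value of v1 as a placeholder.
shifted_diff <- function(v1, v2) {
  stopifnot(length(v1) == length(v2))
  c(v2[-1] - head(v1, -1), tail(v1, 1))
}

# List-column case: Map() always returns a list, one element per row.
dat1 <- data.frame(V1 = I(list(1:3, c(1, 2, 4), c(2, 3, 4, 5))),
                   V2 = I(list(c(3, 6, 5), c(7, 10, 9), 2:5)))
dat1$V3 <- Map(shifted_diff, dat1$V1, dat1$V2)

# Character-column case: split on commas, convert to numeric,
# compute the differences, then paste each row back into one string.
to_num <- function(x) lapply(strsplit(x, ","), as.numeric)
dat2 <- data.frame(V1 = c("1,2,3", "1,2,4", "2,3,4,5"),
                   V2 = c("3,6,5", "7,10,9", "2,3,4,5"),
                   stringsAsFactors = FALSE)
dat2$V3 <- mapply(function(a, b) paste(shifted_diff(a, b), collapse = ","),
                  to_num(dat2$V1), to_num(dat2$V2))
```

Printing dat1 and dat2 afterwards should show the V3 column next to V1 and V2. The stopifnot() check inside the helper is there so that rows with mismatched lengths fail loudly instead of being silently recycled.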

A.K.



On Tuesday, April 29, 2014 3:58 AM, arun <smartpink111 at yahoo.com> wrote:
Hi,

It is better to show the example data using ?dput().  Here, it is not clear whether the columns are character columns or lists.
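
As a rough illustration of why dput() helps (the data frame ex below is a made-up example, not the poster's actual data): its output is R code that rebuilds the object with the column types intact, so there is no guessing about character versus list columns.

```r
# Hypothetical example data; the name `ex` is only for illustration.
ex <- data.frame(V1 = c("1,2,3", "1,2,4"),
                 V2 = c("3,6,5", "7,10,9"),
                 stringsAsFactors = FALSE)

# dput() prints R code that recreates the object, column types included,
# so readers can paste it straight into their own session.
dput(ex)

# Round trip: evaluating the dput() text gives back an equivalent data frame.
ex2 <- eval(parse(text = capture.output(dput(ex))))
str(ex2)
```
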
##If the columns are lists

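dat1 <- data.frame(V1=I(list(1:3, c(1,2,4), c(2,3,4,5))),
                   V2=I(list(c(3,6,5), c(7,10,9), 2:5)))
# SIMPLIFY=FALSE keeps the results as lists even when every row has the same length
diffs <- mapply(`-`, lapply(dat1$V2, `[`, -1), lapply(dat1$V1, head, -1), SIMPLIFY=FALSE)
dat1$V3 <- mapply(`c`, diffs, lapply(dat1$V1, tail, 1), SIMPLIFY=FALSE)
dat1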
#          V1         V2         V3
#1    1, 2, 3    3, 6, 5    5, 3, 3
#2    1, 2, 4   7, 10, 9    9, 7, 4
#3 2, 3, 4, 5 2, 3, 4, 5 1, 1, 1, 5


##If the columns are character vectors.

dat2 <- structure(list(V1 = c("1,2,3", "1,2,4", "2,3,4,5"), V2 = c("3,6,5", 
"7,10,9", "2,3,4,5")), .Names = c("V1", "V2"), row.names = c(NA, 
-3L), class = "data.frame")
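lst1 <- sapply(dat2, function(x) lapply(strsplit(x, split=","), as.numeric))
# SIMPLIFY=FALSE keeps the results as lists even when every row has the same length
diffs <- mapply(`-`, lapply(lst1[,2], `[`, -1), lapply(lst1[,1], head, -1), SIMPLIFY=FALSE)
dat2$V3 <- unlist(lapply(mapply(`c`, diffs, lapply(lst1[,1], tail, 1), SIMPLIFY=FALSE), paste, collapse=","))
dat2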
#       V1      V2      V3
#1   1,2,3   3,6,5   5,3,3
#2   1,2,4  7,10,9   9,7,4
#3 2,3,4,5 2,3,4,5 1,1,1,5


A.K.


 Hi,

I have a quick question in R. I have a data frame with two columns, each holding multiple values separated by commas.
Example:
           
        V1           V2
1    1, 2, 3      3, 6, 5
2    1, 2, 4      7, 10, 9
3    2, 3, 4, 5   2, 3, 4, 5

I want to calculate the difference between the two columns.

Expected results (suppose the results are stored in V3): basically, subtract the n-th value of column 1 from the (n+1)-th value of column 2.
       
        V3          
1    6-1, 5-2, 3      
2    10-1, 9-2, 4    
3    3-2, 4-3, 5-4, 5

which gives (the last value doesn't matter)
     
       V3          
1    5, 3, 3      
2    9, 7, 4    
3    1, 1, 1, 5

I would greatly appreciate it if anyone could suggest how to proceed.


