[R] a complicated merging task

tathta caitlyn.paget at gmail.com
Mon Jul 20 21:07:11 CEST 2009


I would like to merge two dataframes, but I have a condition that needs to
be used for the merge as well.

The rows (observations) in each dataframe are identified by each person's ID
and by the date of the observation.
Basically, I would like the rows to be merged on both ID (exact match) and
date (a condition where one dataframe's date must be after the other
dataframe's date).


Below I've given some sample dataframes to work with, described my
mysterious function, and constructed my ideal output.


#setting up my sample dataframes
dateA <- as.Date(c("13/01/2001","14/02/2005","17/01/2005","27/06/2006"),
                 "%d/%m/%Y")
dateB <- as.Date(c("22/11/2002","13/02/2005","18/08/2005","18/01/2006",
                   "21/08/2007","21/04/2009","17/05/2009","17/05/2009"),
                 "%d/%m/%Y")
dataA <- data.frame(id=c("A","B","C","B"),date=dateA, x=11:14,y=5:2)
dataB <- data.frame(id=c("B","A","B","C","B","C","D","B"),date=dateB,
                    m=27:20,n=22:29)


#mystery function, something like:
# data.merged <- merge(dataB,dataA,by.y=(id and date, where dataB's date >
# dataA's date), all.x=TRUE)


#ideal final product would look like the dataframe created by the following
#5 lines
data.merged <- dataB[order(dataB$id),] 
names(data.merged) <- c("id","date.b","m","n")
data.merged$date.a <- as.Date(c("13/01/2001","14/02/2005","14/02/2005",
                                "27/06/2006","27/06/2006","17/01/2005",
                                "17/01/2005",NA),"%d/%m/%Y")
data.merged$x <- c(11,12,12,14,14,13,13,NA)
data.merged$y <- c(5,4,4,2,2,3,3,NA)
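
One possible way to approximate the mystery function in base R is sketched below. It is only a sketch under assumptions: it merges on id alone, discards candidate pairs where dataB's date is not strictly after dataA's date, and then keeps the latest remaining dataA row for each dataB row. The "latest earlier row" rule is inferred from the hand-built output above rather than stated anywhere, and the helper names (row.b, cand, ok, best) are made up for illustration. Note that under the strict date rule the B observation dated 22/11/2002 has no earlier dataA row, so this sketch gives NA there, whereas the hand-built output above lists 14/02/2005 for that row.

```r
# Sample data repeated so the sketch runs on its own
dateA <- as.Date(c("13/01/2001","14/02/2005","17/01/2005","27/06/2006"),
                 "%d/%m/%Y")
dateB <- as.Date(c("22/11/2002","13/02/2005","18/08/2005","18/01/2006",
                   "21/08/2007","21/04/2009","17/05/2009","17/05/2009"),
                 "%d/%m/%Y")
dataA <- data.frame(id=c("A","B","C","B"),date=dateA, x=11:14,y=5:2)
dataB <- data.frame(id=c("B","A","B","C","B","C","D","B"),date=dateB,
                    m=27:20,n=22:29)

# Tag each dataB row so it can be identified after the merge (hypothetical helper column)
dataB$row.b <- seq_len(nrow(dataB))

# Step 1: merge on id only, keeping every dataB row; the two date columns
# become date.b and date.a
cand <- merge(dataB, dataA, by="id", all.x=TRUE, suffixes=c(".b",".a"))

# Step 2: flag candidate pairs where dataB's date is strictly after dataA's date
ok <- !is.na(cand$date.a) & cand$date.b > cand$date.a

# Blank out the dataA columns for pairs that fail the condition
cand$date.a[!ok] <- NA
cand$x[!ok] <- NA
cand$y[!ok] <- NA

# Step 3 (assumption): per dataB row, keep the latest qualifying dataA row.
# Sorting with NAs first means the last row in each group is the best one.
cand <- cand[order(cand$row.b, cand$date.a, na.last=FALSE),]
best <- cand[!duplicated(cand$row.b, fromLast=TRUE),]

# Arrange like dataB[order(dataB$id),] and drop the helper column
data.merged <- best[order(best$id, best$row.b),
                    c("id","date.b","m","n","date.a","x","y")]
rownames(data.merged) <- NULL
data.merged
```

If the first matching dataA row should be used when no earlier one exists, step 2 would need a fallback; that rule is not spelled out above, so it is left out here.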






