[R] Read 2 rows in 1 dataframe for diff - longitudinal data

arun smartpink111 at yahoo.com
Tue Jun 4 06:51:11 CEST 2013


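If it is grouped by "subid", the first row of each subject (row 6 here) is not compared with the last row of the previous subject. That accounts for the difference in the number of changes: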

library(plyr)
subset(ddply(df1,.(subid),mutate,delta=c(FALSE,var[-1]!=var[-length(var)])),delta)[,-4]
#   subid year var
#3     36 2003   3
#7     47 2001   3
#9     47 2005   1
#10    47 2007   3
A.K.
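
For readers without plyr, the same grouped comparison can be sketched in base R with ave(). This is only an illustration, not code from the thread: the names `changed` and `res` are made up here, and df1 is rebuilt from the original question. Unlike the tapply()/unlist() version, ave() puts each result back in the position of its source row, so it does not depend on the rows being sorted by subid (it still assumes years are in order within each subject).

```r
# Example data, copied from the original question
df1 <- data.frame(subid = rep(c(36, 47), each = 5),
                  year  = rep(seq(1999, 2007, 2), 2),
                  var   = c(1, 1, 3, 3, 3, 1, 3, 3, 1, 3))

# Within each subid, flag rows whose var differs from the previous row.
# The first row of every group has no predecessor, so it is FALSE.
# ave() returns numeric 0/1 here because df1$var is numeric,
# hence the as.logical() conversion.
changed <- as.logical(ave(df1$var, df1$subid,
                          FUN = function(x) c(FALSE, diff(x) != 0)))

res <- df1[changed, ]
res
```

With the data above this should select rows 3, 7, 9 and 10, matching the requested output.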


----- Original Message -----
From: David Winsemius <dwinsemius at comcast.net>
To: arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Tuesday, June 4, 2013 12:37 AM
Subject: Re: [R] Read 2 rows in 1 dataframe for diff - longitudinal data


On Jun 3, 2013, at 7:10 PM, arun wrote:

> Hi,
> Maybe this helps:
> res1<-df1[with(df1,unlist(tapply(var,list(subid),FUN=function(x) c(FALSE,diff(x)!=0)),use.names=FALSE)),]
>  res1
> #   subid year var
> #3     36 2003   3
> #7     47 2001   3
> #9     47 2005   1
> #10    47 2007   3
> #or
> library(plyr)
>  subset(ddply(df1,.(subid),mutate,delta=c(FALSE,diff(var)!=0)),delta)[,-4]
> #   subid year var
> #3     36 2003   3
> #7     47 2001   3
> #9     47 2005   1
> #10    47 2007   3
> A.K.
> 
It's pretty simple with logical indexing:

> df1[ c(FALSE, df1$var[-1]!=df1$var[-length(df1$var)]), ]
   subid year var
3     36 2003   3
6     47 1999   1
7     47 2001   3
9     47 2005   1
10    47 2007   3


When I count the number of changes in the value of var, I get 5. I am not sure why you are both leaving out row 6.

-- 
David.
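
The gap between the two counts (5 versus 4) can be made explicit with a small base R sketch. It is an illustration under assumptions, not part of the original exchange: the helper names `value_changed` and `same_subject` are invented here, and the rows are assumed to be sorted by subid and then by year. Row 6 is flagged only when the first row of subject 47 is compared with the last row of subject 36.

```r
# Example data, copied from the original question
df1 <- data.frame(subid = rep(c(36, 47), each = 5),
                  year  = rep(seq(1999, 2007, 2), 2),
                  var   = c(1, 1, 3, 3, 3, 1, 3, 3, 1, 3))
n <- nrow(df1)

# TRUE where var differs from the row directly above, ignoring subid
value_changed <- c(FALSE, df1$var[-1] != df1$var[-n])

# TRUE where the row directly above belongs to the same subject
same_subject <- c(FALSE, df1$subid[-1] == df1$subid[-n])

ungrouped <- df1[value_changed, ]                 # includes row 6
grouped   <- df1[value_changed & same_subject, ]  # drops row 6

rownames(ungrouped)
rownames(grouped)
```

Which result is wanted depends on whether a change across the subject boundary counts as a change; the original request lists only the four within-subject rows.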
> 
> 
> I need to output the rows of a data frame where var changes value. 
> 
> df1 <- data.frame(subid=rep(c(36,47),each=5),year=rep(seq(1999,2007,2),2),var=c(1,1,3,3,3,1,3,3,1,3)) 
>    subid year var 
> 1     36 1999   1 
> 2     36 2001   1 
> 3     36 2003   3 
> 4     36 2005   3 
> 5     36 2007   3 
> 6     47 1999   1 
> 7     47 2001   3 
> 8     47 2003   3 
> 9     47 2005   1 
> 10    47 2007   3 
> 
> I need: 
> 36 2003   3 
> 47 2001   3 
> 47 2005   1 
> 47 2007   3 
> 
> I am trying to use ddply over subid with the diff function, but it is not working quite right. 
> 
>> dd <- ddply(df1,.(subid),summarize,delta=diff(var) != 0) 
>> dd 
>   subid delta 
> 1    36 FALSE 
> 2    36  TRUE 
> 3    36 FALSE 
> 4    36 FALSE 
> 5    47  TRUE 
> 6    47 FALSE 
> 7    47  TRUE 
> 8    47  TRUE 
> 
> I would appreciate any help on this. 
> Thank You! 
> -ST
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



