[R] help with calculation from dataframe with multiple entries per sample

William Dunlap wdunlap at tibco.com
Tue Sep 18 17:12:12 CEST 2012


The following works even when the input data frame has its rows
scrambled.  It does not currently check that there is exactly one entry
in each sample for Time==2 and Time==3.

within(mydata,
       `Gain2-3` <- ave(seq_along(Sample),
                        Sample,
                        FUN = function(i) {
                            L2 <- Time[i] == 2
                            L3 <- Time[i] == 3
                            Mass[i][L3] - Mass[i][L2]
                        }))
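
As a rough sketch of the check mentioned above, the variant below stops
with an error unless each sample has exactly one Time==2 row and exactly
one Time==3 row. The helper name gain23 and the scrambled row order are
illustrative assumptions, not part of the original code.

```r
# Example data from the original question, with the rows deliberately scrambled.
mydata <- data.frame(Sample = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                     Time   = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                     Mass   = c(3, 3.1, 3.4, 4, 4.3, 4.4, 3, 3.2, 3.5))
scrambled <- mydata[c(9, 2, 4, 6, 1, 8, 3, 5, 7), ]

# gain23 is a hypothetical helper: Mass at Time 3 minus Mass at Time 2 for
# the rows indexed by i, after checking that each time point occurs once.
gain23 <- function(i, Time, Mass) {
    L2 <- Time[i] == 2
    L3 <- Time[i] == 3
    stopifnot(sum(L2) == 1, sum(L3) == 1)
    Mass[i][L3] - Mass[i][L2]
}

result <- within(scrambled,
                 `Gain2-3` <- ave(seq_along(Sample),
                                  Sample,
                                  FUN = function(i) gain23(i, Time, Mass)))
```

With the example data this should give gains of 0.3, 0.1 and 0.3 for
samples 1, 2 and 3 (up to floating-point rounding), whatever the row order.
A sample with a missing or duplicated time point raises an error instead of
silently returning a wrong-length result.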

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of arun
> Sent: Monday, September 17, 2012 8:12 PM
> To: Julie Lee-Yaw
> Cc: R help
> Subject: Re: [R] help with calculation from dataframe with multiple entries per sample
> 
> Hi,
> Modified version of my earlier solution:
> res1<-tapply(mydata$Mass,mydata$Sample,FUN=function(x) (x[3]-x[2]))
> res2<-data.frame(Sample=names(res1),Gain2_3=res1)
>  merge(mydata,res2)
> 
> #Sample Time Mass Gain2_3
> #1      1    1  3.0     0.3
> #2      1    2  3.1     0.3
> #3      1    3  3.4     0.3
> #4      2    1  4.0     0.1
> #5      2    2  4.3     0.1
> #6      2    3  4.4     0.1
> #7      3    1  3.0     0.3
> #8      3    2  3.2     0.3
> #9      3    3  3.5     0.3
> A.K.
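
The positional x[3]-x[2] in the quoted solution relies on each sample's rows
being sorted by Time. A sketch of an order-independent variant of the same
merge idea, picking rows by their Time value instead (the names t2, t3,
gains and res are illustrative, not from the thread):

```r
# Example data from the original question.
mydata <- data.frame(Sample = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                     Time   = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                     Mass   = c(3, 3.1, 3.4, 4, 4.3, 4.4, 3, 3.2, 3.5))

# Select the Time 2 and Time 3 rows by value rather than by position.
t2 <- mydata[mydata$Time == 2, c("Sample", "Mass")]
t3 <- mydata[mydata$Time == 3, c("Sample", "Mass")]

# Pair them up per sample; the suffixes keep the two Mass columns apart.
gains <- merge(t2, t3, by = "Sample", suffixes = c(".t2", ".t3"))
gains$Gain2_3 <- gains$Mass.t3 - gains$Mass.t2

# Attach the per-sample gain to every row of the original data.
res <- merge(mydata, gains[, c("Sample", "Gain2_3")])
```

Like the quoted version, this assumes one Time 2 and one Time 3 row per
sample; duplicated time points would produce extra rows in the merge.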
> 
> 
> 
> ----- Original Message -----
> From: Julie Lee-Yaw <julleeyaw at yahoo.ca>
> To: "r-help at r-project.org" <r-help at r-project.org>
> Cc:
> Sent: Monday, September 17, 2012 7:15 PM
> Subject: [R] help with calculation from dataframe with multiple entries per sample
> 
> Hi
> 
> I have a dataframe similar to:
> 
> >Sample<-c(1,1,1,2,2,2,3,3,3)
> 
> >Time<-c(1,2,3,1,2,3,1,2,3)
> 
> >Mass<-c(3,3.1,3.4,4,4.3,4.4,3,3.2,3.5)
> 
> >mydata<-as.data.frame(cbind(Sample,Time,Mass))
> 
> 
>   Sample Time Mass
> 1      1    1  3.0
> 2      1    2  3.1
> 3      1    3  3.4
> 4      2    1  4.0
> 5      2    2  4.3
> 6      2    3  4.4
> 7      3    1  3.0
> 8      3    2  3.2
> 9      3    3  3.5
> 
> where for each sample, I've measured mass at different points in time.
> 
> I now want to calculate the difference between Mass at Time 2 and 3 for each unique
> Sample and store this as a new variable called "Gain2-3". So in my example three values
> of 0.3,0.1,0.3 would be calculated for my three unique samples and these values would
> be repeated in the table according to Sample. I am thus expecting:
> 
> >mydata #after adding new variable
> 
>   Sample Time Mass Gain2-3
> 1      1    1  3.0     0.3
> 2      1    2  3.1     0.3
> 3      1    3  3.4     0.3
> 4      2    1  4.0     0.1
> 5      2    2  4.3     0.1
> 6      2    3  4.4     0.1
> 7      3    1  3.0     0.3
> 8      3    2  3.2     0.3
> 9      3    3  3.5     0.3
> 
> Does anyone have any suggestions as to how to do this? I've looked at the various apply
> functions but I can't seem to make anything work. I'm fairly new to R and would
> appreciate specific suggestions.
> 
> Thanks!
> 
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



