[R] help with calculation from dataframe with multiple entries per sample

David Winsemius dwinsemius at comcast.net
Tue Sep 18 05:51:48 CEST 2012


On Sep 17, 2012, at 7:28 PM, arun wrote:

> HI,
> Try this:
>  mydata$Gain<-rep(tapply(mydata$Mass,mydata$Sample,FUN=function(x) (x[3]-x[2])),each=length(unique(mydata$Sample)))
>  mydata
> #  Sample Time Mass Gain
> #1      1    1  3.0  0.3
> #2      1    2  3.1  0.3
> #3      1    3  3.4  0.3
> #4      2    1  4.0  0.1
> #5      2    2  4.3  0.1
> #6      2    3  4.4  0.1
> #7      3    1  3.0  0.3
> #8      3    2  3.2  0.3
> #9      3    3  3.5  0.3
> A.K.

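That is going to fail as soon as the number of rows differs across values of Sample. The each= argument is set to the number of unique samples, which only happens to equal the number of rows per sample (3) in the original data.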

> Sample<-c(1,1,1,2,2,2,3,3,3,3)
> Time<-c(1,2,3,1,2,3,1,2,3, 4)
> Mass<-c(3,3.1,3.4,4,4.3,4.4,3,3.2,3.5, 3.7)
> mydata<-as.data.frame(cbind(Sample,Time,Mass))
>  mydata$Gain<-rep(tapply(mydata$Mass,mydata$Sample,FUN=function(x) (x[3]-x[2])),each=length(unique(mydata$Sample)))
Error in `$<-.data.frame`(`*tmp*`, "Gain", value = c(0.3, 0.3, 0.3, 0.100000000000001,  : 
  replacement has 9 rows, data has 10
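
One way around this, offered as a minimal sketch rather than the only answer: compute one gain per Sample, then either look it up by Sample name for each row or let ave() do the recycling within each group. The names gain and Gain_ave are my own choices for illustration, and both versions still assume that the second and third rows within each Sample correspond to Time 2 and Time 3.

```r
Sample <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
Time   <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4)
Mass   <- c(3, 3.1, 3.4, 4, 4.3, 4.4, 3, 3.2, 3.5, 3.7)
mydata <- data.frame(Sample, Time, Mass)

# One value per Sample; tapply() names the result by the Sample levels
gain <- tapply(mydata$Mass, mydata$Sample, function(x) x[3] - x[2])

# Index by name so each row receives the gain for its own Sample,
# regardless of how many rows that Sample has
mydata$Gain <- as.vector(gain[as.character(mydata$Sample)])

# Alternative: ave() applies FUN within each group and recycles the
# single returned value across that group's rows
mydata$Gain_ave <- ave(mydata$Mass, mydata$Sample,
                       FUN = function(x) x[3] - x[2])
```

Neither version depends on the groups being the same size, so the 10-row example above no longer raises the replacement-length error.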


> ----- Original Message -----
> From: Julie Lee-Yaw <julleeyaw at yahoo.ca>
> To: "r-help at r-project.org" <r-help at r-project.org>
> Cc: 
> Sent: Monday, September 17, 2012 7:15 PM
> Subject: [R] help with calculation from dataframe with multiple entries per sample
> 
> Hi 
> 
> I have a dataframe similar to:
> 
>> Sample<-c(1,1,1,2,2,2,3,3,3)
> 
>> Time<-c(1,2,3,1,2,3,1,2,3)
> 
>> Mass<-c(3,3.1,3.4,4,4.3,4.4,3,3.2,3.5)
> 
>> mydata<-as.data.frame(cbind(Sample,Time,Mass))
> 
> 
>   Sample Time Mass
> 1      1    1  3.0
> 2      1    2  3.1
> 3      1    3  3.4
> 4      2    1  4.0
> 5      2    2  4.3
> 6      2    3  4.4
> 7      3    1  3.0
> 8      3    2  3.2
> 9      3    3  3.5
> 
> where for each sample, I've measured mass at different points in time. 
> 
> I now want to calculate the difference between Mass at Time 2 and 3 for each unique Sample and store this as a new variable called "Gain2-3". So in my example three values of 0.3,0.1,0.3 would be calculated for my three unique samples and these values would be repeated in the table according to Sample. I am thus expecting:
> 
>> mydata #after adding new variable
> 
>   Sample Time Mass Gain2-3
> 1      1    1  3.0     0.3
> 2      1    2  3.1     0.3
> 3      1    3  3.4     0.3
> 4      2    1  4.0     0.1
> 5      2    2  4.3     0.1
> 6      2    3  4.4     0.1
> 7      3    1  3.0     0.3
> 8      3    2  3.2     0.3
> 9      3    3  3.5     0.3
> 
> Does anyone have any suggestions as to how to do this? I've looked at the various apply functions but I can't seem to make anything work. I'm fairly new to R and would appreciate specific suggestions. 
> 
> Thanks!

David Winsemius, MD
Alameda, CA, USA



