[R] Vertical subtraction in dataframes

David Winsemius dwinsemius at comcast.net
Sat Mar 13 03:59:00 CET 2010


On Mar 12, 2010, at 5:27 PM, Sam Albers wrote:

> Hello all,
>
> I have not been able to find an answer to this problem. I feel like it
> might be so simple though that it might not get a response.
>
> Suppose I have a dataframe like the one I have copied below (minus the
> 'calib' column). I wish to create a column like calib where I am
> subtracting the 'Count' when 'stain' is 'none' from all other 'Count'
> data for every value of 'rep'. This is sort of analogous to putting a $
> in front of the number that identifies a cell in a spreadsheet
> environment. Specifically I need something like this:
>
> mydataframe$calib <- Count - (Count when stain = none for each value rep)
>
> Any thoughts on how I might accomplish this?
>
> Thanks in advance.
>
> Sam
>
> Note: I've already calculated the calib column in gnumeric for clarity.
>
> rep Count    stain    calib
> 1   1522     none     0
> 1   147      syto     -1375
> 1   544.8    sytolec  -977.2
> 1   2432.6   sytolec  910.6
> 1   234.6    sytolec  -1287.4
> 2   5699.8   none     0
> 2   265.6    syto     -5434.2
> 2   329.6    sytolec  -5370.2
> 2   383      sytolec  -5316.8
> 2   968.8    sytolec  -4731
> 3   2466.8   none     0
> 3   1303     syto     -1163.8
> 3   1290.6   sytolec  -1176.2
> 3   110.2    sytolec  -2356.6
> 3   15086.8  sytolec  12620

This method does not depend on the ordering, which I believe both
solutions so far require (though it may fail if more than one row in a
rep group satisfies the stain == "none" test). It is an example of what
Spector calls split-apply-bind logic. See below:

 > dfrm$calib2 <- unlist( lapply(split(dfrm, dfrm$rep),
                  function(x) x$calib <- x$Count - x[x$stain == "none", "Count"]) )
 > dfrm
    repp   Count   stain   calib  calib2
1     1  1522.0    none     0.0     0.0
2     1   147.0    syto -1375.0 -1375.0
3     1   544.8 sytolec  -977.2  -977.2
4     1  2432.6 sytolec   910.6   910.6
5     1   234.6 sytolec -1287.4 -1287.4
6     2  5699.8    none     0.0     0.0
7     2   265.6    syto -5434.2 -5434.2
8     2   329.6 sytolec -5370.2 -5370.2
9     2   383.0 sytolec -5316.8 -5316.8
10    2   968.8 sytolec -4731.0 -4731.0
11    3  2466.8    none     0.0     0.0
12    3  1303.0    syto -1163.8 -1163.8
13    3  1290.6 sytolec -1176.2 -1176.2
14    3   110.2 sytolec -2356.6 -2356.6
15    3 15086.8 sytolec 12620.0 12620.0
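
A possible variation, offered only as a sketch: unlist() concatenates the
pieces in group order, so the assignment above lines up with the rows
only when they are already grouped by rep, as they are here. The version
below looks up each group's "none" count with match() instead, so it does
not rely on row order at all. It assumes the grouping column is named
repp, as in the printed output, and rebuilds the data from the posted
numbers; the name baseline is introduced purely for illustration.

```r
# Rebuild the posted data (grouping column named 'repp' as in the printout)
dfrm <- data.frame(
  repp  = rep(1:3, each = 5),
  Count = c(1522, 147, 544.8, 2432.6, 234.6,
            5699.8, 265.6, 329.6, 383, 968.8,
            2466.8, 1303, 1290.6, 110.2, 15086.8),
  stain = rep(c("none", "syto", "sytolec", "sytolec", "sytolec"), times = 3),
  stringsAsFactors = FALSE
)

# One baseline row per group: the rows where stain is "none"
baseline <- dfrm[dfrm$stain == "none", c("repp", "Count")]

# match() gives, for every row of dfrm, the position of its group in
# 'baseline', so the subtraction lines up regardless of row order
dfrm$calib2 <- dfrm$Count - baseline$Count[match(dfrm$repp, baseline$repp)]
```

If a group has more than one "none" row, match() silently uses the first
one, so a separate check on the number of baseline rows may still be
worth adding.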

 > dfrm[3,3] <-"none"
 > dfrm$calib2 <- unlist( lapply(split(dfrm, dfrm$rep),
                  function(x) x$calib <- x$Count - x[x$stain == "none", "Count"]) )
Warning message:
In x$Count - x[x$stain == "none", "Count"] :
   longer object length is not a multiple of shorter object length
>
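
The warning comes from vector recycling: with two "none" rows in the
first group, a length-5 vector has a length-2 vector subtracted from it,
and R recycles the shorter one rather than stopping. One way to make that
case fail loudly is sketched below. The helper subtract_baseline and the
toy data frame are hypothetical, made up for this illustration, and are
not part of the original session.

```r
# Hypothetical helper: subtract a group's single "none" count, or stop
subtract_baseline <- function(x) {
  base <- x$Count[x$stain == "none"]
  if (length(base) != 1L) {
    stop("expected exactly one 'none' row per group, found ", length(base))
  }
  x$Count - base
}

# Small made-up example with two "none" rows in the same group
toy <- data.frame(
  repp  = c(1, 1, 1),
  Count = c(10, 4, 7),
  stain = c("none", "syto", "none"),
  stringsAsFactors = FALSE
)

# The duplicate baseline now raises an error; capture its message
res <- tryCatch(
  unlist(lapply(split(toy, toy$repp), subtract_baseline)),
  error = function(e) conditionMessage(e)
)

# With a single "none" row the helper returns the differences
toy$stain[3] <- "syto"
ok <- subtract_baseline(toy)
```

Whether to stop, or instead to average the duplicate baselines, depends
on what the repeated "none" rows mean in the experiment.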
--

David Winsemius, MD
West Hartford, CT


