[R] Vertical subtraction in dataframes

Gabor Grothendieck ggrothendieck at gmail.com
Sat Mar 13 06:21:58 CET 2010


And here is one more SQL solution. This one uses the window functions
in PostgreSQL and is slightly more compact and even closer to the pure
R version in our prior post.  We quote Count so its not regarded as
the reserved word of the same name. Note that if RpgSQL is loaded then
sqldf will automatically use PostgreSQL rather than SQLite.

library(RpgSQL)
library(sqldf)
sqldf('select rep, "Count", stain, "Count" - (sum("Count" * int4(stain
= \'none\')) over (partition by rep)) from DF')


On Fri, Mar 12, 2010 at 11:49 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> Try this ave solution noting that summing Count * (stain == "none")
> over each group gives the Count in its stain=="none" row and ave
> causes that Count value to be repeated for every row of the group so
> we get a vector that can be subtracted from Count:
>
> transform(DF, calib = Count - ave(Count * (stain == "none"), rep, FUN = sum))
>
> and here is an SQL solution which works in roughly the same way:
>
> library(sqldf)
> sqldf("select rep, Count, stain, Count - none calib from DF
>  natural join
>  (select rep, sum(Count * (stain = 'none')) none from DF group by rep)")
>
>
> On Fri, Mar 12, 2010 at 5:27 PM, Sam Albers <tonightsthenight at gmail.com> wrote:
>> Hello all,
>>
>> I have not been able to find an answer to this problem. I feel like it might
>> be so simple though that it might not get a response.
>>
>> Suppose I have a dataframe like the one I have copied below (minus the
>> 'calib' column). I wish to create a column like calib where I am subtracting
>> the 'Count' when 'stain' is 'none' from all other 'Count'  data for every
>> value of 'rep'. This is sort of analogous to putting a $ in front of the
>> number that identifies a cell in a spreadsheet environment.  Specifically I
>> need some like this:
>>
>> mydataframe$calib <- Count - (Count when stain = none for each value rep)
>>
>> Any thoughts on how I might accomplish this?
>>
>> Thanks in advance.
>>
>> Sam
>>
>> Note: I've already calculated the calib column in gnumeric for clarity.
>>
>> rep Count stain calib
>> 1 1522         none 0
>> 1 147         syto -1375
>> 1 544.8 sytolec -977.2
>> 1 2432.6 sytolec 910.6
>> 1 234.6 sytolec -1287.4
>> 2 5699.8 none 0
>> 2 265.6 syto -5434.2
>> 2 329.6 sytolec -5370.2
>> 2 383         sytolec -5316.8
>> 2 968.8 sytolec -4731
>> 3 2466.8 none 0
>> 3 1303         syto -1163.8
>> 3 1290.6 sytolec -1176.2
>> 3 110.2 sytolec -2356.6
>> 3 15086.8 sytolec 12620
>>
>> --
>> *****************************************************
>> Sam Albers
>> Geography Program
>> University of Northern British Columbia
>> 3333 University Way
>> Prince George, British Columbia
>> Canada, V2N 4Z9
>> phone: 250 960-6777
>> *****************************************************
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



More information about the R-help mailing list