[R] function to filter identical data.frames using less than (<) and greater than (>)

Karl Brand k.brand at erasmusmc.nl
Thu Dec 6 17:35:01 CET 2012


Hi Jeff,

Subset is indeed what's required here. But using it every time it's
needed was generating excessive amounts of obtuse code. So for the sake
of clarity and convenience I wanted a wrapper function to replace these
repetitious subsets.

Although Rui's example works just fine, I'd love to see any idiomatic ways
you might attempt this (also for the sake of improving my grasp of R).
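
For illustration only, here is a minimal sketch of the kind of wrapper
being asked about. It is not Rui's code. The name filter_df, the "..."
argument layout and the condition syntax are assumptions made up for
this example, and it takes the data frame itself rather than its name
as a string.

```r
# Hypothetical helper: filter_df() is an illustrative name, not from the thread.
# Each condition is a string such as "< 0" or "> 1"; NA or NULL skips that column.
filter_df <- function(data, ...) {
  conds <- list(...)
  keep <- rep(TRUE, nrow(data))
  for (col in names(conds)) {
    cond <- conds[[col]]
    if (is.null(cond) || is.na(cond)) next   # filtering on this column is optional
    # Split e.g. "< 0" into the operator and the threshold.
    parts <- regmatches(
      cond,
      regexec("^[[:space:]]*(<=|>=|<|>)[[:space:]]*(.+)$", cond)
    )[[1]]
    if (length(parts) != 3) stop("cannot parse condition: ", cond)
    op <- match.fun(parts[2])                # "<" and ">" are ordinary functions
    keep <- keep & op(data[[col]], as.numeric(parts[3]))
  }
  # which() drops rows where a comparison gave NA, as subset() does.
  data[which(keep), , drop = FALSE]
}

eg <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10), D = rnorm(10))
egsub  <- filter_df(eg, A = "< 0", B = "< 1", C = "> 0")
egsub2 <- filter_df(eg, A = "> 1", B = "> 0", C = NA)
```

Starting from an all-TRUE mask and AND-ing in one condition per supplied
column is what makes every column optional without a separate ifelse.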

Cheers,

Karl




On 06/12/12 15:57, Jeff Newmiller wrote:
> You have not indicated why the subset function is insufficient for your needs...
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                        Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Karl Brand <k.brand at erasmusmc.nl> wrote:
>
>> Esteemed UseRs,
>>
>> I've got many biggish data frames which need a lot of subsetting, like in
>> this example:
>>
>> # example
>> eg <- data.frame(A = rnorm(10), B = rnorm(10), C = rnorm(10), D =
>> rnorm(10))
>> egsub <- eg[eg$A < 0 & eg$B < 1 & eg$C > 0, ]
>> egsub
>> egsub2 <- eg[eg$A > 1 & eg$B > 0, ]
>> egsub2
>>
>> # To make this clearer than 1000s of lines of extractions with []
>> # I tried to make a function like this:
>>
>> # func(data="eg", A="< 0", B="< 1", C="> 0")
>>
>> # Which would also need to be run as
>>
>> # func(data="eg", A="> 1", B="> 0", C=NA)
>> #end
>>
>> Notably:
>> - the signs* "<" and ">" need to be flexible _and_ optional
>> - the quantities also need to be flexible
>> - column header names, i.e., A, B and C, don't need flexibility
>>   and can remain fixed
>> * "less than" and "greater than" so Google picks up this thread
>>
>> Once again I find just how limited my grasp of R is... Is do.call() the
>> best way to call binary operators like < and > in a function? Is an
>> ifelse statement needed for each column to make filtering on it
>> optional? etc.
>>
>> Anyone with the patience to show their working version of such a
>> function would receive my undying Rdulation. With thanks in advance,
>>
>> Karl
>
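
On the do.call() and ifelse questions in the quoted message, a small
sketch may help. In R a comparison operator is an ordinary two-argument
function, so do.call() is possible but not needed, and optional filters
can be combined with Reduce() instead of one ifelse per column. The
vectors x and y and the use of NULL to mean "no filter" are assumptions
chosen for this example.

```r
x <- c(-2, 0.5, 3)
y <- c(1, -1, 2)

# A comparison operator is an ordinary two-argument function, so these agree:
r1 <- x < 1
r2 <- `<`(x, 1)
r3 <- do.call("<", list(x, 1))   # works, but adds nothing over a direct call

# Optional filters without one ifelse() per column: keep only the masks
# that were requested, then AND them together. NULL stands for "no filter".
masks <- list(x > 0, NULL, y > 0)
masks <- Filter(Negate(is.null), masks)
keep  <- Reduce(`&`, masks, rep(TRUE, length(x)))
x[keep]
```

The initial value rep(TRUE, length(x)) means that an empty list of masks
keeps every element.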

-- 
Karl Brand
Dept of Cardiology and Dept of Bioinformatics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
T +31 (0)10 703 2460 |M +31 (0)642 777 268 |F +31 (0)10 704 4161
