[R] selection by two unique variables

jim holtman jholtman at gmail.com
Wed May 2 18:56:31 CEST 2012


try this:

> x <- read.table(text = "id        wtdt        wt   lastpk
+  64050256  2010-09-18  275  2010-09-16
+  64050256  2010-09-19  277  2010-09-18
+  64050256  2010-09-20  272  2010-09-18
+  64050256  2010-09-21  277  2010-09-18", as.is = TRUE, header = TRUE)
>
>  first <- lapply(split(x, list(x$id, x$lastpk), drop = TRUE), function(a){
+     a[1,, drop = FALSE]
+ })
> do.call(rbind, first)
                          id       wtdt  wt     lastpk
64050256.2010-09-16 64050256 2010-09-18 275 2010-09-16
64050256.2010-09-18 64050256 2010-09-19 277 2010-09-18
>
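
A shorter alternative, offered only as a sketch and not part of the session
above: the question already uses !duplicated(data$id), and the same idea
should extend to both columns at once, because duplicated() also accepts a
data frame and then compares whole rows. The object name `first` and the
reduced sample data are just for illustration.

```r
# Same sample data as in the question (columns: id, wtdt, wt, lastpk)
x <- read.table(text = "id wtdt wt lastpk
64050256 2010-09-18 275 2010-09-16
64050256 2010-09-19 277 2010-09-18
64050256 2010-09-20 272 2010-09-18
64050256 2010-09-21 277 2010-09-18", as.is = TRUE, header = TRUE)

# duplicated() on the two key columns flags every repeat of an
# (id, lastpk) pair; negating it keeps the first row of each pair
first <- x[!duplicated(x[c("id", "lastpk")]), ]
first
```

Unlike the split/lapply version, this keeps the rows in their original order
and leaves the original row names in place, which may or may not matter for
a large dataset.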


On Wed, May 2, 2012 at 11:23 AM, Bert Gunter <gunter.berton at gene.com> wrote:
> ?tapply
>
> ?with is also useful here
>
> as in (untested)
> with(yourdataframe, tapply(lastpk, id, unique))
>
> -- Bert
>
> On Wed, May 2, 2012 at 7:58 AM, Ayyappa Chaturvedula
> <ayyappach at gmail.com> wrote:
>> Dear Group,
>>
>> I am working with a large dataset where I need to select for each unique id
>> the unique lastpk row.  Here is a sample subject:
>>
>>          id             wtdt           wt         lastpk
>>
>>  64050256 2010-09-18   275  2010-09-16
>>
>>  64050256 2010-09-19   277  2010-09-18
>>
>>  64050256 2010-09-20   272  2010-09-18
>>
>>  64050256 2010-09-21   277  2010-09-18
>>
>>
>>
>> I want the result as:
>>
>>       id               wtdt         wt      lastpk
>>
>> 64050256 2010-09-18 275 2010-09-16
>>
>> 64050256 2010-09-19 277 2010-09-18
>>
>>
>>
>> I am using !(duplicated(data$id)) to select the first row but now I want to
>> select the first row of the unique lastpk in each unique id.
>>
>>
>>
>> I appreciate your help on this.
>>
>>
>>
>> Regards,
>>
>> Ayyappa
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


