[R] How to subset a data frame to include only first events

arun smartpink111 at yahoo.com
Tue Feb 5 14:30:00 CET 2013


Hi,

If the `Date` column is not ordered within each ID, take the minimum date per ID:
Date1=as.Date(c("01/05/2012","01/07/2012","01/15/2012","01/09/2012","01/14/2012","01/25/2012",
"01/08/2012","01/24/2012","01/03/2012"),format="%m/%d/%Y")

dat1<-data.frame(ID=rep(1:3,each=3),Date1)
aggregate(Date1~ID,data=dat1,FUN=min)
#  ID      Date1
#1  1 2012-01-05
#2  2 2012-01-09
#3  3 2012-01-03
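
If the data frame has other columns that should be kept along with the earliest date, one possible approach (a sketch, not the only way) is to sort by ID and date and then drop the repeated IDs with duplicated(). The name `dat3` and the `Event` column below are made up for illustration:

```r
# Hypothetical data: same IDs and dates as dat1, plus an invented Event column
dat3 <- data.frame(
  ID    = rep(1:3, each = 3),
  Date1 = as.Date(c("01/05/2012", "01/07/2012", "01/15/2012",
                    "01/09/2012", "01/14/2012", "01/25/2012",
                    "01/08/2012", "01/24/2012", "01/03/2012"),
                  format = "%m/%d/%Y"),
  Event = letters[1:9]
)

# Sort by ID, then by date within each ID
dat3s <- dat3[order(dat3$ID, dat3$Date1), ]

# Keep only the first row seen for each ID, i.e. the earliest date,
# with all other columns intact
first3 <- dat3s[!duplicated(dat3s$ID), ]
first3
#  ID      Date1 Event
#1  1 2012-01-05     a
#4  2 2012-01-09     d
#9  3 2012-01-03     i
```

The row names (1, 4, 9) are those of the original rows, so they also show which rows were kept.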

# If it is already ordered by date within each ID:

Date2=as.Date(c("01/05/2012","01/07/2012","01/15/2012","01/09/2012","01/14/2012","01/25/2012",
"01/03/2012","01/08/2012","01/24/2012"),format="%m/%d/%Y")
dat2<- data.frame(ID=rep(1:3,each=3),Date2)
aggregate(Date2~ID,data=dat2,head,1)
#  ID      Date2
#1  1 2012-01-05
#2  2 2012-01-09
#3  3 2012-01-03
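
The `head, 1` version only returns the earliest date when the rows are already sorted by date within each ID. A quick way to check that before relying on it might look like the sketch below. The helper name `is_sorted_within_id` is invented here, not a base R function:

```r
# Hypothetical helper: TRUE if dates are non-decreasing within every ID
is_sorted_within_id <- function(id, date) {
  # Compare plain numbers so the check does not depend on Date methods
  unsorted <- sapply(split(as.numeric(date), id), is.unsorted)
  !any(unsorted)
}

ID <- rep(1:3, each = 3)
unordered <- as.Date(c("01/05/2012", "01/07/2012", "01/15/2012",
                       "01/09/2012", "01/14/2012", "01/25/2012",
                       "01/08/2012", "01/24/2012", "01/03/2012"),
                     format = "%m/%d/%Y")
ordered <- as.Date(c("01/05/2012", "01/07/2012", "01/15/2012",
                     "01/09/2012", "01/14/2012", "01/25/2012",
                     "01/03/2012", "01/08/2012", "01/24/2012"),
                   format = "%m/%d/%Y")

is_sorted_within_id(ID, unordered)  # FALSE: ID 3 starts with 2012-01-08
is_sorted_within_id(ID, ordered)    # TRUE
```

If the check returns FALSE, the min() version is the safer choice.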
A.K.

----- Original Message -----
From: Dylan Arena <darena at stanford.edu>
To: r-help at r-project.org
Cc: 
Sent: Monday, February 4, 2013 9:29 PM
Subject: [R] How to subset a data frame to include only first events

Hi there,


I have a data frame with columns ID and Date.  There are multiple rows for
each ID, but I only want to keep the *first* such row, i.e., the row
corresponding to the earliest event.  So if I had, say, 1000 rows of 100
IDs doing an average of ten events each, I'd run this trimming procedure
and end up with a data frame containing 100 rows (one for each ID), where
each row records that ID's first event.

I can think of slow, clumsy, for-loop ways to trim the data frame, but I'm
hopeful that there is some slick "R" way to do it that someone here can
help me find.  But so deep is my ignorance that I can't even come up with
useful search terms to use on Rseek.org (I investigated "merge" but had no
luck there).


Grateful for any ideas/tips/pointers,
Dylan


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



