[R] subset data based on values in multiple columns

Petr PIKAL petr.pikal at precheza.cz
Tue Jul 3 09:12:03 CEST 2012


Hi

> 
> Dear list members,
> 
> I am trying to create a subset of a data frame based on conditions in 
two 
> columns, and after spending much time trying (and search R-help) have 
not 
> had any luck. Essentially, I have a data frame that is something like 
this:
> 
> date<-as.POSIXct(as.character(c
> 
("2012-01-25","2012-01-25","2012-01-26","2012-01-27","2012-01-27","2012-01-27")))
> time<-as.POSIXct(as.character(c("13:20", "13:40", "14:00", "10:00", "10:
> 20", "10:20")), format="%H:%M")
> count<-c(12,14,11,12,12,8)
> data<-data.frame(date,time,count)
> 
> which looks like:
> 
>     date         time      count
> 1 2012-01-25    13:20:00     12
> 2 2012-01-25    13:40:00     14
> 3 2012-01-26    14:00:00     11
> 4 2012-01-27    10:00:00     12
> 5 2012-01-27    10:20:00     12
> 6 2012-01-27    10:20:00      8
> 
> I would like to create a subset by doing the following: for each unique 
> date, only include one case which will be the case with the max value 
for 
> the column labelled "count". So the resulting subset would be:
> 
>     date         time      count
> 2 2012-01-25    13:40:00     14
> 3 2012-01-26    14:00:00     11
> 4 2012-01-27    10:00:00     12
> 
> Some dates have two cases at which the "count" was the same, but I only 
> want to include one case (I don't really mind which case it chooses, but 

> if need be it could be based on the earliest "time" for which the same 
> counts occurred). I have tried various loops with no success! I'm sure 
> that there is an easy answer that I have not found! Any help is much 
appreciated!!

Just few days ago similarquestion was asked (selecting rows by maximum 
value of one variables in dataframe nested by another Variable). Here is 
what was recommended.

do.call("rbind",lapply(split(data, data$date), function(x) 
x[which.max(x[,2]),]))

Regards
Petr


> 
> All the best,
> 
> Chandra
> 
> 
>    [[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list