[R] identify data points by certain criteria

arun smartpink111 at yahoo.com
Thu Jun 13 22:15:24 CEST 2013


Hi,
Maybe this helps:
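source("Ye_data.txt")   # defines dat1: timestamp in column 1, two value columns
dim(dat1)
#[1] 44640     3
library(xts)
# Build an xts object indexed by the parsed timestamps
xt1<- xts(dat1[,-1],strptime(dat1[,1],"%m/%d/%Y %H:%M"))
# Keep only the midnight to 08:00 window of every day
xtSub<-xt1["T00:00:00/T08:00:00"]
dim(xt1)
#[1] 44640     2
dim(xtSub)
#[1] 14911     2
# One xts object per calendar day
lst1<-split(xtSub,as.Date(index(xtSub)))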

# Per day: position, among the rows summing to 0, of the largest gap since the previous such row
sapply(lst1,function(x) {indx<- which(rowSums(x)==0);indx1<-which.max(c(1,diff(index(x)[indx]))) })
#2012-12-01 2012-12-02 2012-12-03 2012-12-04 2012-12-05 2012-12-06 2012-12-07 
#       373         41        262        268        266        254        274 
#2012-12-08 2012-12-09 2012-12-10 2012-12-11 2012-12-12 2012-12-13 2012-12-14 
#       109          1        323        264        279        353        265 
#2012-12-15 2012-12-16 2012-12-17 2012-12-18 2012-12-19 2012-12-20 2012-12-21 
#       327        226        264        269        271        267        276 
#2012-12-22 2012-12-23 2012-12-24 2012-12-25 2012-12-26 2012-12-27 2012-12-28 
#       360        162        222         81        231        143        364 
#2012-12-29 2012-12-30 2012-12-31 
#       122        399        418 
 lst2<-lapply(lst1,function(x) {indx<- which(rowSums(x)==0);indx1<-which.max(c(1,diff(index(x)[indx])));index(x)[indx1] })

lst2[1:3]
#$`2012-12-01`
#[1] "2012-12-01 06:12:00 EST"
#
#$`2012-12-02`
#[1] "2012-12-02 00:40:00 EST"
#
#$`2012-12-03`
#[1] "2012-12-03 04:21:00 EST"
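
One thing to watch in the lst2 step: which.max() returns a position within indx, not a row number of x, so index(x)[indx1] and index(x)[indx[indx1]] agree only when the first indx1 rows of the day are all zero rows. The toy-data code earlier in this thread maps back through indx. A minimal base R sketch of that mapping on the toy data from the original question (the names dat, mins, gaps and pos are introduced here for illustration only):

```r
# Toy data from the original question
dat <- data.frame(
  Time = sprintf("00:%02d", 0:13),
  Var1 = c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0),
  Var2 = 0,
  stringsAsFactors = FALSE
)

indx <- which(dat$Var1 == 0 & dat$Var2 == 0)        # rows 2, 5, 10, 14
mins <- as.numeric(sub(".*:", "", dat$Time[indx]))  # 1, 4, 9, 13
gaps <- diff(mins)                                  # 3, 5, 4 minutes
pos  <- which.max(c(1, gaps))                       # 3: a position within indx

dat$Time[indx[pos]]  # "00:09": mapped back through indx
dat$Time[pos]        # "00:02": what indexing with pos alone would give
```

With xts objects the same idea would presumably read index(x)[indx[indx1]]; that variant has not been run against the real data here.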
A.K.






________________________________
From: Ye Lin <yelin at lbl.gov>
To: arun <smartpink111 at yahoo.com> 
Sent: Thursday, June 13, 2013 1:11 PM
Subject: Re: [R] identify data points by certain criteria



hey Arun,

Sorry about the confusion. My intention in giving a simple sample was to simplify the question, so that I could study and modify the code you provided and apply it to my real data.

Here is what my real data looks like. It is 1-minute data for an entire month. I will focus on the period from 00:00 to 08:00 every day (midnight to 8 am) and try to find the timestamp that meets the criteria I mentioned before.

Thanks for your help!

Ye
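
The window described in the message above (00:00 to 08:00 of every day, on 1-minute data) can also be approximated without xts. A rough base R sketch on synthetic data; the data frame and its column names (stamp, Var1, Var2) are invented for illustration and are not taken from Ye_data.txt:

```r
# Synthetic stand-in: two days of 1-minute timestamps (UTC, to sidestep DST)
stamp <- seq(as.POSIXct("2012-12-01 00:00", tz = "UTC"),
             as.POSIXct("2012-12-02 23:59", tz = "UTC"), by = "1 min")
set.seed(1)
dat <- data.frame(stamp = stamp,
                  Var1  = rbinom(length(stamp), 1, 0.5),
                  Var2  = 0)

# Minutes since midnight; keep 00:00 through 08:00 inclusive
mins <- as.numeric(format(dat$stamp, "%H")) * 60 +
        as.numeric(format(dat$stamp, "%M"))
sub  <- dat[mins <= 8 * 60, ]

# One data frame per calendar day
by_day <- split(sub, as.Date(sub$stamp, tz = "UTC"))
sapply(by_day, nrow)  # 481 rows per day
```

The 481 rows per day are consistent with the 14911 rows over 31 days reported by dim(xtSub) in the reply.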



On Thu, Jun 13, 2013 at 9:57 AM, arun <smartpink111 at yahoo.com> wrote:


>
>
>Hi Ye,
>Could you provide an example that mimics your real dataset?  Because if I spend some time on this and it does not match your case, then it is a waste of time.
>
>
>
>
>________________________________
>From: Ye Lin <yelin at lbl.gov>
>To: arun <smartpink111 at yahoo.com>
>Sent: Thursday, June 13, 2013 12:54 PM
>
>Subject: Re: [R] identify data points by certain criteria
>
>
>
>Oh, sorry.
>
>It's going to be from 00:00 to 23:59, i.e. a one-day range.
>
>
>
>On Thu, Jun 13, 2013 at 9:42 AM, arun <smartpink111 at yahoo.com> wrote:
>
>
>>
>>Hi,
>>
>>I was talking about the timestamp itself.  I don't know the range of your timestamp.
>>
>>
>>indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))]
>>#[1] 10
>>
>> dat1[indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))],]
>>#    Time Var1 Var2
>>#10 00:09    0    0
>>A.K.
>>
>>________________________________
>>From: Ye Lin <yelin at lbl.gov>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Thursday, June 13, 2013 12:00 PM
>>Subject: Re: [R] identify data points by certain criteria
>>
>>
>>
>>
>>Basically, what I am trying to do is find the first timestamp that meets the criteria, in other words "when does it happen".
>>
>>
>>
>>On Wed, Jun 12, 2013 at 6:29 PM, arun <smartpink111 at yahoo.com> wrote:
>>
>>Hi,
>>>Not clear about the 'Time' column.
>>>dat1<- read.table(text="
>>>
>>>Time    Var1      Var2
>>>00:00    1              0
>>>00:01    0              0
>>>00:02    1              0
>>>00:03    1              0
>>>00:04    0              0
>>>00:05    1              0
>>>00:06    1              0
>>>00:07    1              0
>>>00:08    1              0
>>>00:09    0              0
>>>00:10    1              0
>>>00:11    1              0
>>>00:12    1              0
>>>00:13    0              0
>>>",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>
>>>
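>>># Rows where Var1 and Var2 are both 0
>>>indx<-which(rowSums(dat1[,-1])==0)
>>># Row ending the longest gap (in minutes) between consecutive zero rows
>>>dat1[indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))],]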
>>>#    Time Var1 Var2
>>>#10 00:09    0    0
>>>dat1[indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))],"Time"]
>>>#[1] "00:09"
>>>
>>>
>>>A.K.
>>>
>>>
>>>
>>>
>>>----- Original Message -----
>>>From: Ye Lin <yelin at lbl.gov>
>>>To: R help <r-help at r-project.org>
>>>Cc:
>>>Sent: Wednesday, June 12, 2013 8:55 PM
>>>Subject: [R] identify data points by certain criteria
>>>
>>>Hi, I want to identify data points by certain criteria. Here is an example of
>>>my 1-minute data:
>>>
>>>Time     Var1      Var2
>>>00:00    1              0
>>>00:01    0              0
>>>00:02    1              0
>>>00:03    1              0
>>>00:04    0              0
>>>00:05    1              0
>>>00:06    1              0
>>>00:07    1              0
>>>00:08    1              0
>>>00:09    0              0
>>>00:10    1              0
>>>00:11    1              0
>>>00:12    1              0
>>>00:13    0              0
>>>
>>>I want to identify the data points where Var1=0 and Var2=0 (in this
>>>example, the rows at 00:01, 00:04, 00:09 and 00:13), then calculate the time
>>>duration between these data points (in this example, 3 min, 5 min
>>>and 4 min), then identify the starting point of the max time duration (in
>>>this example, the starting point of the 5-min duration, i.e. the data
>>>point at 00:09), and finally return the value in the "Time" column (in this
>>>example, "00:09").
>>>
>>>Thanks for your help!
>>>
>>>
>>>    [[alternative HTML version deleted]]
>>>
>>>______________________________________________
>>>R-help at r-project.org mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>


