[R] Need help subsetting time series data

Nathaniel nathanielrayl at Hotmail.com
Wed Feb 2 15:04:03 CET 2011


Hi all,

I have multiple datasets of time series data taken from GPS collars.  The
collars are supposed to take a fix every hour on the half hour, i.e., 0:30,
1:30, 2:30 ... 23:30.  Because the collars sometimes take longer to acquire
a location, the minute of these fixes varies from 30 to 34.  However, because
of a software glitch, the collars start taking multiple fixes between
programmed fixes at random times, e.g., 22:31, 23:31, 0:31, 1:26, 1:29,
1:30, 1:31, 1:32, 1:33, 1:35, 1:35, 1:35, 1:35, 1:36, 1:36, 1:36, 2:30.
These glitches occur approximately once a day and can start at any hour of
the day and any minute of the hour.  I want to remove all of these extra
locations from my dataset, but I am new to R and haven't figured out a way
to do so.  I've tried some inelegant solutions involving verbose code, but
haven't been able to come up with something that works correctly.

Some things I've tried:

#Subsetting out by minute value:
>MR1001=read.csv(etc)
>datetime<-paste(MR1001$date,MR1001$time)
>datetime<-strptime(datetime, "%m/%d/%Y %H:%M:%S", tz="UTC")
>MR1001$min<-datetime$min
>t1<-subset(MR1001, min %in% 30:34)

This works for most of the data, but when the unwanted fixes occur during
the 30-34 minute mark of an hour (see example above) they are kept, which I
don't want.  To deal with this I tried to incorporate the time between fixes
in an attempt to write an "ifelse" statement and subset the data that way:

>MR1001=read.csv(etc)
>datetime<-paste(MR1001$date,MR1001$time)
>datetime<-as.POSIXct(strptime(datetime, "%m/%d/%Y %H:%M:%S", tz="UTC"))
>MR1001$min<-as.POSIXlt(datetime)$min
#Minutes until the next fix; the last fix is given a value of 60:
>MR1001$diff<-c(as.numeric(diff(datetime), units="mins"), 60)
>t1<-subset(MR1001, min %in% 30:34)

This didn't work either, because whenever an unwanted fix followed a wanted
fix, the value in the "diff" column for the wanted fix was small too, and I
couldn't figure out how to subset the data in that format.  In the example
below I want to keep the 1st, 2nd, 3rd, and last fixes (23:31, 00:31, 01:30,
02:30):

time     diff (min)
23:31    60
00:31    58.78
01:30    1.07
01:31    1.07
01:32    1.07
01:33    1.07
01:34    1.08
02:30    60
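
One possible direction, sketched here under assumptions rather than tested on
the real collar data: keep the 30-34 minute filter, then retain only the
earliest fix within each calendar hour.  The data frame `fixes`, its column
`datetime`, and the dates are hypothetical; only the clock times come from
the example sequence earlier in this message.

```r
# Hypothetical example data: clock times from the message, dates invented.
fix_times <- c("02/01/2011 22:31:00", "02/01/2011 23:31:00",
               "02/02/2011 00:31:00", "02/02/2011 01:26:00",
               "02/02/2011 01:29:00", "02/02/2011 01:30:00",
               "02/02/2011 01:31:00", "02/02/2011 01:32:00",
               "02/02/2011 01:33:00", "02/02/2011 01:35:00",
               "02/02/2011 01:36:00", "02/02/2011 02:30:00")
fixes <- data.frame(
  datetime = as.POSIXct(fix_times, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
)

# Sort chronologically so that "first" means "earliest".
fixes <- fixes[order(fixes$datetime), , drop = FALSE]

# Minute of the hour, and a label identifying the calendar hour of each fix.
minute  <- as.POSIXlt(fixes$datetime)$min
hour_id <- format(fixes$datetime, "%Y-%m-%d %H")

# Step 1: keep only fixes inside the 30-34 minute window.
in_window_rows <- minute %in% 30:34
in_window <- fixes[in_window_rows, , drop = FALSE]

# Step 2: within that window, keep only the first fix of each hour.
kept <- in_window[!duplicated(hour_id[in_window_rows]), , drop = FALSE]

format(kept$datetime, "%H:%M")
```

On this example the sketch keeps 22:31, 23:31, 00:31, 01:30 and 02:30.  A
known limitation: if a glitch fix happens to fall inside the 30-34 minute
window just before the scheduled fix, the glitch fix is the one that is kept,
although there is still only one fix per hour.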

I hope this explanation is clear and that someone with more experience than
me can help with a solution.

Thanks very much in advance for your time and help!

Nathaniel
-- 
View this message in context: http://r.789695.n4.nabble.com/Need-help-subsetting-time-series-data-tp3254236p3254236.html
Sent from the R help mailing list archive at Nabble.com.


