[R] Posix Problem, difftime
jasont at indigoindustrial.co.nz
Thu Oct 17 09:22:04 CEST 2002
On Thu, Oct 17, 2002 at 11:51:24AM -0500, Matthew Pocernich wrote:
... [trouble with POSIX date/time classes] ...
> I assume most of my problems come from a misunderstanding of the POSIX class. My matrix named (aa) for this year is approx 8700 by 4. When I try to calculate the length of the posit column (which is the date and time) I get
Judging by the different column types this looks more like a data frame than
a matrix - minor picky point, but computers are even pickier ;-).
> > length(aa$posit)
>  9
Correct. aa$posit is a POSIXlt structure, which is effectively a 9-element list.
Each list element is a vector as long as the number of rows in your data.frame.
Using the aa you've provided (data frame with 5 rows):
> length(aa$posit) #gives list length - POSIXlt objects have 9 list elements.
> length(as.character(aa$posit)) #convert to string dates, then get length
> lapply(aa$posit,length) #get length of every list member
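For reference, here is a small sketch of what sits inside a POSIXlt object. The timestamps are retyped from the sample data; the "UTC" timezone and the variable names are my assumptions. The number of list components, and whether length() reports the list length or the number of date-times, depends on the R version, so the sketch strips the class before looking.

```r
# Five hourly timestamps retyped from the sample data (timezone is an assumption).
stamps <- c("1901-10-28 22:00:00", "1901-10-28 23:00:00",
            "1901-10-29 00:00:00", "1901-10-29 01:00:00",
            "1901-10-29 02:00:00")
lt <- as.POSIXlt(stamps, tz = "UTC")

# Strip the class to see the underlying list of components.
parts <- unclass(lt)
names(parts)                         # sec, min, hour, mday, mon, year, ...
hours <- parts$hour                  # one entry per timestamp

# Counting the formatted strings gives the number of date-times
# regardless of how length() treats POSIXlt in a given R version.
n.times <- length(as.character(lt))
```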
> When I try to combined POSIX columns or manipulate them like matrices, I have problems such as
> > cbind(aa$posit, aa$posit)
> Error in cbind(...) : cannot create a matrix from these types
Yes. cbind works nicely with vectors of atomic data types. It gets a little hairy with
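
One possible way around this, offered as a sketch rather than the only option: convert to POSIXct, which stores each date-time as a single number, and combine the columns with data.frame(), which keeps the date-time class. The names below and the "UTC" timezone are assumptions for illustration.

```r
# Sample timestamps (timezone is an assumption).
stamps <- as.POSIXlt(c("1901-10-28 22:00:00", "1901-10-28 23:00:00",
                       "1901-10-29 00:00:00"), tz = "UTC")

# POSIXct is an atomic (numeric) vector with one value per date-time.
ct <- as.POSIXct(stamps)

# data.frame() keeps the date-time class of each column.
both <- data.frame(first = ct, second = ct)
str(both)
```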
> Ultimately I would like to check to make certain the hourly data is not missing. ...
> Ultimately I used a for loop, which is very slow and it produces the following error.
As Larry Wall once said in answer to a perl question:
Q. Why is this so clumsy?
A. The trick is to use [the language's] strengths rather than its weaknesses.
-- Larry Wall in <8225 at jpl-devvax.JPL.NASA.GOV>
What's below works using R's wonderful indexing, with difftime. Note that the first
element of the list is more properly NA than 0; there is no "before-the-first"
observation, so time difference makes no sense for the first.
   tide               posit diffh y365
1 -1.25 1901-10-28 22:00:00  -2.7  301
2 -2.75 1901-10-28 23:00:00  -1.5  301
3 -2.25 1901-10-29 00:00:00   0.5  302
4 -0.25 1901-10-29 01:00:00   2.0  302
5  2.65 1901-10-29 02:00:00   2.9  302
> n.obs <- nrow(aa) #the number of rows - just a convenience thing.
> n.obs #check it.
> aa$difftime <- c(NA,difftime(aa$posit[2:n.obs],aa$posit[1:(n.obs-1)]))
   tide               posit diffh y365 difftime
1 -1.25 1901-10-28 22:00:00  -2.7  301       NA
2 -2.75 1901-10-28 23:00:00  -1.5  301        1
3 -2.25 1901-10-29 00:00:00   0.5  302        1
4 -0.25 1901-10-29 01:00:00   2.0  302        1
5  2.65 1901-10-29 02:00:00   2.9  302        1
# a quick check on which rows had difftime not equal to 1:
> aa[which(aa$difftime != 1),]
 tide posit diffh y365 difftime
<0 rows> (or 0-length row.names)
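
A variation on the same idea, as a sketch: difftime() chooses its units automatically unless told otherwise, so it may be safer to ask for hours explicitly. The data below are made up, with one hour deliberately left out, and the "UTC" timezone is an assumption.

```r
# Made-up hourly series with the 01:00 observation missing.
posit <- as.POSIXct(c("1901-10-28 22:00:00", "1901-10-28 23:00:00",
                      "1901-10-29 00:00:00", "1901-10-29 02:00:00"),
                    tz = "UTC")
n.obs <- length(posit)

# Gap to the previous observation, in hours; NA for the first row.
gap.h <- c(NA, as.numeric(difftime(posit[-1], posit[-n.obs], units = "hours")))

# Rows that follow a gap other than one hour (which() drops the NA).
bad.rows <- which(gap.h != 1)
bad.rows
```

Here the last row is flagged, because it comes two hours after the one before it.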
[side issue, to answer the question that's probably in your head now:
"why are dates so hard in R?"]
Dates and times are implemented very thoroughly and very solidly in R.
Dates and times themselves have a ludicrous, irrational, irregular
structure, and are probably the hardest sticking point in any piece
of programming. Anything that handles dates solidly and in generality
is going to be hard. Even worse, different operating systems and
programming environments all had their own, uh, "clever" solutions.
Integrating this, and making it work the same way no matter where R
is running is a very, very difficult thing.
Believe it or not, R makes dates and times much, much easier.
If it seems to have some bizarre corners, it's because dates and
times have some very very strange twists.
So, let me just thank Brian D. Ripley and Kurt Hornik once again
for the wonderful job they've done on the date/time classes.
Good work, guys.
Indigo Industrial Controls Ltd.
jasont at indigoindustrial.co.nz