[R] correct my method of estimating mean of two POSIXlt data frames

Gabor Grothendieck ggrothendieck at myway.com
Wed Oct 6 06:51:32 CEST 2004


Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:

: > > If a.lt and b.lt are the two vectors of POSIXlt dates then try
: > > converting each to POSIXct and unclassing to make each numeric.
: > > Take the mean of the two numeric vectors and convert them back to
: >
: > I see. I'm a little confused with the use of class/unclass versus
: > as.XX. For example, instead of using unclass, why wouldn't I use
: > as.numeric? Could someone explain the difference?
: 
: That advice is wrong: you should not be unclassing before forming the
: mean as mean() has a method for POSIXct.

You did not quote the code I posted but, actually, it gives
the correct answer.  For example, with the poster's data:

R> a.lt <- as.POSIXlt(c("2003-07-09 11:02:25", "2003-07-09 11:10:25", 
+                   "2003-07-09 11:30:25", "2003-07-09 12:00:25"))
R> b.lt <- as.POSIXlt(c("2003-07-09 11:02:35", "2003-07-09 11:10:35", 
+                   "2003-07-09 11:30:35", "2003-07-09 12:00:35"))

R> # code and the answer it gives:

R> a <- unclass(as.POSIXct(a.lt))
R> b <- unclass(as.POSIXct(b.lt))
R> as.POSIXlt(structure((a+b)/2, class = c("POSIXt", "POSIXct")))
[1] "2003-07-09 11:02:30 Eastern Daylight Time"
[2] "2003-07-09 11:10:30 Eastern Daylight Time"
[3] "2003-07-09 11:30:30 Eastern Daylight Time"
[4] "2003-07-09 12:00:30 Eastern Daylight Time"

Also, mean.POSIXlt calls mean.POSIXct, which in turn unclasses its
argument, so the data are unclassed somewhere along the way even if
one uses mean.
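
As a small sketch of the point about mean, the following compares the
POSIXct method with averaging the unclassed values. The variable names
and tz = "UTC" are assumptions made here only so that the result is
reproducible. Note that mean collapses a vector to a single value, so
by itself it does not give the element-wise mean of two vectors.

```r
# Assumption: two of the poster's timestamps, read as UTC.
x <- as.POSIXct(c("2003-07-09 11:02:25", "2003-07-09 11:02:35"), tz = "UTC")

m <- mean(x)             # dispatches to the POSIXct method; class is kept
s <- mean(unclass(x))    # plain number of seconds since the epoch

class(m)
as.numeric(m) == s       # same underlying value either way
```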

Perhaps you are referring to the fact that unclass violates
encapsulation, in which case I agree that as.numeric is
better conceptually, though I would add the caveat that if you
are really going to be a stickler about encapsulation then R's
object model is probably not the one for you.
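
On the quoted question of unclass versus as.numeric, here is a minimal
sketch of one visible difference: unclass removes only the class
attribute, whereas as.numeric returns a bare numeric vector. The
object x and tz = "UTC" are assumptions for illustration.

```r
# Assumption: a single UTC timestamp, so that a tzone attribute is present.
x <- as.POSIXct("2003-07-09 11:02:25", tz = "UTC")

u <- unclass(x)      # class dropped, other attributes (tzone) retained
n <- as.numeric(x)   # all attributes dropped

attributes(u)
attributes(n)
u == n               # the numeric value is the same
```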

Anyway, it occurred to me afterwards that one could avoid the
structure() part of the above solution, at the expense of slight
obfuscation, by using the fact that for any two real numbers
(a+b)/2 = (a-b)/2 + b, so:

R> a <- as.numeric(a.lt+0)
R> b <- as.numeric(b.lt+0)
R> (a-b)/2 + b.lt
[1] "2003-07-09 11:02:30 Eastern Daylight Time"
[2] "2003-07-09 11:10:30 Eastern Daylight Time"
[3] "2003-07-09 11:30:30 Eastern Daylight Time"
[4] "2003-07-09 12:00:30 Eastern Daylight Time"
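
A variant on the same identity, offered only as a sketch, is to stay
within the date-time classes throughout: the difference of two POSIXct
vectors is a difftime, which can be halved and added back. The names
a.ct, b.ct, mid.ct and mid.lt, and the use of tz = "UTC", are
assumptions made here for reproducibility.

```r
# Assumption: the poster's data, read as UTC rather than local time.
a.ct <- as.POSIXct(c("2003-07-09 11:02:25", "2003-07-09 11:10:25",
                     "2003-07-09 11:30:25", "2003-07-09 12:00:25"), tz = "UTC")
b.ct <- as.POSIXct(c("2003-07-09 11:02:35", "2003-07-09 11:10:35",
                     "2003-07-09 11:30:35", "2003-07-09 12:00:35"), tz = "UTC")

# b.ct - a.ct is a difftime; half of it added to a.ct is the
# element-wise midpoint, with no unclass or structure needed.
mid.ct <- a.ct + (b.ct - a.ct) / 2
mid.lt <- as.POSIXlt(mid.ct)   # back to POSIXlt if that is what is wanted
```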



