[R] subset and as.POSIXct / as.POSIXlt oddness

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Thu Mar 24 15:52:47 CET 2011


On 03/24/2011 06:29 AM, Michael Bach wrote:
> Dear R users,
>
> Given this data:
>
> x <- seq(1, 100, 1)
> dx <- as.POSIXct(x*900, origin="2007-06-01 00:00:00")
> dfx <- data.frame(dx)
>
> Now to play around for example:
>
> subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00"))
>
> Ok. Now for some reason I want to extract the datapoints between hours
> 10:00:00 and 14:00:00, so I thought well:
>
> subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00"), 14 > as.POSIXlt(dx)$hour
> & as.POSIXlt(dx)$hour < 10)
> Error in as.POSIXlt.numeric(dx) : 'origin' must be supplied
>
> Well that did not work. But why does the following work?
>
> 14 > as.POSIXlt(dx)$hour & as.POSIXlt(dx)$hour < 10
>

It works there because, at top level, dx is the POSIXct vector you created 
above. Try it.

> Is there something I miss about subset()?

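You have given three arguments to subset().  The second argument (subset) 
selects rows; the third (select) selects columns.  Inside select, column 
names stand for column positions, so dx evaluates to the number 1 and 
as.POSIXlt() complains about the missing origin.  Combine both row conditions 
in the second argument with & instead:

subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00") & 14 > as.POSIXlt(dx)$hour
       & as.POSIXlt(dx)$hour < 10)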
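
The difference between the second and third arguments of subset() can be 
checked directly. The following is only a sketch: it rebuilds the data with 
the time zone pinned to UTC (an assumption made here so that the hours do not 
depend on the local zone), and the last line imitates, as far as I understand 
subset.data.frame, how the select expression is evaluated. The names rows, 
cols and pos are just illustrative.

```r
# Same data as in the question, but with the time zone fixed to UTC
# (an assumption, so the result does not depend on the local time zone).
dx  <- as.POSIXct("2007-06-01 00:00:00", tz = "UTC") + (1:100) * 900
dfx <- data.frame(dx)

# Second argument (subset): 'dx' is the column itself, a POSIXct vector,
# so date-time functions can be applied to it.
rows <- subset(dfx, as.POSIXlt(dx)$hour < 10)

# Third argument (select): 'dx' is used to pick a column.
cols <- subset(dfx, select = dx)

# Imitation of the select evaluation: column names are mapped to their
# positions, so 'dx' evaluates to the integer 1, not to a date-time.
pos <- eval(quote(dx), as.list(setNames(seq_along(dfx), names(dfx))))
print(pos)
```

Passing a number like that to as.POSIXlt() without an origin is what produced 
the error message in the original call.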

or better yet,

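tmp <- as.POSIXlt(dfx$dx)

subset(dfx, dx > as.POSIXct("2007-06-01 16:00:00") & 14 > tmp$hour & tmp$hour < 10)

since as.POSIXlt() is a rather heavyweight operation.  Note that the hour 
test as written, 14 > tmp$hour & tmp$hour < 10, reduces to tmp$hour < 10.  For 
the hours from 10:00 up to 14:00 it should read 
tmp$hour >= 10 & tmp$hour < 14.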
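
For the stated goal (points between 10:00:00 and 14:00:00), a minimal sketch 
with plain logical indexing might look like this. It assumes UTC for 
reproducibility and treats the window as half-open (10:00:00 included, 
14:00:00 excluded); lt, in_window and picked are illustrative names, not part 
of the original code.

```r
# Data as in the question, time zone pinned to UTC (assumption).
dx  <- as.POSIXct("2007-06-01 00:00:00", tz = "UTC") + (1:100) * 900
dfx <- data.frame(dx)

# Convert to POSIXlt once and reuse the result.
lt <- as.POSIXlt(dfx$dx)

# TRUE for time stamps whose hour field is 10, 11, 12 or 13.
in_window <- lt$hour >= 10 & lt$hour < 14

# Keep the matching rows; drop = FALSE keeps a one-column data frame.
picked <- dfx[in_window, , drop = FALSE]
print(range(picked$dx))
```

Whether 14:00:00 itself should be included is a design choice; use 
lt$hour <= 14 together with a test on lt$min if it should.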


>   Or is there even another way of
> aggregating over an hourly time interval in a nicer way?

This is not aggregation.  This is selection. It is only when you summarize 
the selected data that you are aggregating.

Normally, the term aggregating is applied when you use a grouping column and 
collapse many values with the same characteristics into one value per set of 
characteristics.  For example using base functions,

dfx$interval <- cut(tmp$hour, c(0, 10, 14, 24), right = FALSE)  # [0,10), [10,14), [14,24)
aggregate(dfx$dx, list(Interval = dfx$interval), length)

or

aggregate(dfx$dx,list(Hour=tmp$hour),length)
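
A self-contained version of the grouping idea, as a sketch only: the time 
zone is pinned to UTC and the breaks are chosen half-open so that hour 10 
falls into the middle group (both are assumptions, adjust as needed). The 
table() call is added merely as a cross-check of the counts.

```r
# Data as in the question, time zone pinned to UTC (assumption).
dx   <- as.POSIXct("2007-06-01 00:00:00", tz = "UTC") + (1:100) * 900
dfx  <- data.frame(dx)
hour <- as.POSIXlt(dfx$dx)$hour

# Grouping column: hours 0-9, 10-13 and 14-23.
dfx$interval <- cut(hour, breaks = c(0, 10, 14, 24), right = FALSE)

# One row per group, with the number of time stamps in that group.
counts <- aggregate(dfx$dx, list(Interval = dfx$interval), length)
print(counts)

# The same counts from table(), as a cross-check.
check <- table(dfx$interval)
print(check)
```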

but I find that the plyr package is much more user-friendly than aggregate().
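
For comparison, a per-hour count with plyr could be sketched as follows. This 
assumes the plyr package is installed (the code skips the step otherwise), 
pins the time zone to UTC, and uses the illustrative names hour, per_hour 
and n.

```r
# Data as in the question, time zone pinned to UTC (assumption),
# with the hour stored as an ordinary column to group on.
dx  <- as.POSIXct("2007-06-01 00:00:00", tz = "UTC") + (1:100) * 900
dfx <- data.frame(dx, hour = as.POSIXlt(dx)$hour)

if (requireNamespace("plyr", quietly = TRUE)) {
  # Split by hour, count the time stamps in each piece, recombine.
  per_hour <- plyr::ddply(dfx, "hour", plyr::summarise, n = length(dx))
  print(head(per_hour))
} else {
  message("plyr is not installed; skipping the ddply example")
}
```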

> Best Regards,
> Michael Bach


