[R] Getting Annual (Conditional) Averages

Gabor Grothendieck ggrothendieck at gmail.com
Sat Nov 17 15:19:42 CET 2007


On Nov 17, 2007 8:55 AM, Emmanuel Charpentier
<charpent at bacbuc.dyndns.org> wrote:
> Dear Lucia,
>
> lucia wrote:
> > Hello,
> > I'm very new to R, and so my question is simple.
> >
> > I have data record with 80 years of daily temperatures in one long
> > string.  The dates are also recorded, in YYMMDD format.  I'd like to
> > learn an elegant simple way to pull out the annual averages.
> > (Obviously, every 4th year has 366 days.)
> >
> > I know I can set up a formal loop to create annual records and then
> > average. But R seems to have such neat methods, is there some better
> > way to do this?
>
> For the sake of simplicity, let's say you managed to store your data in a
> two-column dataframe df, with columns date and temperature.
>
> The first step is to know how to extract the "year" part of the dates.
> The obvious solution is of course as.numeric(substr(date,1,2)), but I'd
> rather transform your date variable into a genuine R Date class variable,
> by as.Date(date,format="%y%m%d") or as.POSIXlt(date,format="%y%m%d"),
> the latter allowing easy year extraction by reading the "year" component
> (which counts years since 1900).
>
> The second step is of course to apply to each of your relevant subtables
> the "mean" function; that's what tapply() is meant for.
>
> So, a one-liner for your problem might be:
>
> Means<- tapply(df$temperature, as.POSIXlt(df$date,format="%y%m%d")$year,
> FUN=mean, na.rm=TRUE)
>
> However, this is only a very crude way to work with time series. I'd
> consider permanently converting your date variable into a suitable
> datetime representation. There is more than one way to do it: Julian
> dates, chron objects and, more recently introduced, DateTime classes,
> which seem to be "the standard way" to represent dates and times.
> Unfortunately, different packages seem to expect different
> representations... <Sigh>.
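
As a minimal sketch of the two steps quoted above (extract the year, then
apply mean() per year with tapply()), assuming a small made-up data frame
df with YYMMDD strings in date: the sample values are hypothetical and
chosen only so the result is easy to check by hand. Note that R's %y maps
two-digit years 69-99 to 19xx and 00-68 to 20xx, so an 80-year record with
two-digit years may need its century fixed by hand.

```r
# Hypothetical sample data: dates as YYMMDD strings, as in the question.
df <- data.frame(
  date = c("700101", "700102", "701231", "710101", "710102"),
  temperature = c(1, 3, 5, 10, NA),
  stringsAsFactors = FALSE
)

# Parse the strings; the $year component of POSIXlt counts years since 1900,
# so add 1900 to get calendar years.
lt <- as.POSIXlt(df$date, format = "%y%m%d", tz = "UTC")
year <- lt$year + 1900

# Average temperature within each year, ignoring missing values.
means <- tapply(df$temperature, year, FUN = mean, na.rm = TRUE)
print(means)
```

The result is a named vector with one mean per calendar year; leap years
need no special handling because the grouping is by year, not by position.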

That is not true of the zoo package. It does not assume a specific
date/time class.  It supports any date/time class that satisfies the
criteria discussed in ?zoo.  For example, numbers, Date, chron and
POSIXct all satisfy those criteria.
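
A minimal sketch of how annual means might be computed with zoo, assuming
the package is installed and using a Date index; the sample dates and
values are made up for illustration, and this is one possible approach
rather than the only one.

```r
library(zoo)  # assumes the zoo package is installed

# Hypothetical sample data with a Date index.
dates <- as.Date(c("700101", "700102", "701231", "710101", "710102"),
                 format = "%y%m%d")
temps <- c(1, 3, 5, 10, NA)
z <- zoo(temps, order.by = dates)

# Group by the calendar year derived from the index and average each group;
# na.rm = TRUE is passed through to mean().
annual <- aggregate(z, by = as.numeric(format(time(z), "%Y")),
                    FUN = mean, na.rm = TRUE)
print(annual)
```

The aggregated series is itself a zoo object, here indexed by plain
numbers (the years), which illustrates that the index need not be of one
particular date/time class.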


