[R] summarizing daily time-series date by month

Gabor Grothendieck ggrothendieck at myway.com
Wed Jan 26 20:15:51 CET 2005


Benjamin M. Osborne <Benjamin.Osborne <at> uvm.edu> writes:

: 
: Message: 63
: Date: Wed, 26 Jan 2005 04:28:51 +0000 (UTC)
: From: Gabor Grothendieck <ggrothendieck <at> myway.com>
: Subject: Re: [R] chron: parsing dates into a data frame using a
:         forloop
: To: r-help <at> stat.math.ethz.ch
: Message-ID: <loom.20050126T052153-333 <at> post.gmane.org>
: Content-Type: text/plain; charset=us-ascii
: 
: Benjamin M. Osborne <Benjamin.Osborne <at> uvm.edu> writes:
: 
: :
: : I have one data frame with a column of dates and I want to fill another data
: : frame with one column of dates, one of years, one of months, one of a unique
: : combination of year and month, and one of days, but R seems to have some
: : problems with this.  My initial data frame looks like this (ignore the NAs in
: : the other fields):
: :
: : > mans[1:10,]
: :        date loc snow.new prcp tmin snow.dep tmax
: : 1  11/01/54   2       NA   NA   NA       NA   NA
: : 2  11/02/54   2       NA   NA   NA       NA   NA
: : 3  11/03/54   2       NA   NA   NA       NA   NA
: : 4  11/04/54   2       NA   NA   NA       NA   NA
: : 5  11/05/54   2       NA   NA   NA       NA   NA
: : 6  11/06/54   2       NA   NA   NA       NA   NA
: : 7  11/07/54   2       NA   NA   NA       NA   NA
: : 8  11/08/54   2       NA   NA   NA       NA   NA
: : 9  11/09/54   2       NA   NA   NA       NA   NA
: : 10 11/10/54   2       NA   NA   NA       NA   NA
: : >
: :
: : The code and resultant data frame look like this:
: :
: : > for(i in 1:10){
: : + mans.met$date[i]<-mans$date[i]
: : + mans.met$year[i]<-years(mans.met$date[i])
: : + mans.met$month[i]<-months(mans.met$date[i])
: : + mans.met$yearmo[i]<-cut(mans.met$date[i], "months")
: : + mans.met$day[i]<-days(mans.met$date[i])
: : + }
: : > mans.met[1:10,]
: :        date year month yearmo day snow.new snow.dep prcp tmin tmax tmean
: : 1  11/01/54    1    11      1   1       NA       NA   NA   NA   NA    NA
: : 2  11/02/54    1    11      1   2       NA       NA   NA   NA   NA    NA
: : 3  11/03/54    1    11      1   3       NA       NA   NA   NA   NA    NA
: : 4  11/04/54    1    11      1   4       NA       NA   NA   NA   NA    NA
: : 5  11/05/54    1    11      1   5       NA       NA   NA   NA   NA    NA
: : 6  11/06/54    1    11      1   6       NA       NA   NA   NA   NA    NA
: : 7  11/07/54    1    11      1   7       NA       NA   NA   NA   NA    NA
: : 8  11/08/54    1    11      1   8       NA       NA   NA   NA   NA    NA
: : 9  11/09/54    1    11      1   9       NA       NA   NA   NA   NA    NA
: : 10 11/10/54    1    11      1  10       NA       NA   NA   NA   NA    NA
: : >
: :
: : The problem seems to be with assigning within the forloop, or making the
: : assignment into a data frame, since:
: :
: : > years(mans.met$date[5])
: : [1] 1954
: : Levels: 1954
: : > test<-years(mans.met$date[5])
: : > test
: : [1] 1954
: : Levels: 1954
: : >
: : > months(mans.met$date[5])
: : [1] Nov
: : 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
: : > test<-months(mans.met$date[5])
: : > test
: : [1] Nov
: : 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
: : >
: : > cut(mans.met$date[3], "months")
: : [1] Nov 54
: : Levels: Nov 54
: : > test<-cut(mans.met$date[3], "months")
: : > test
: : [1] Nov 54
: : Levels: Nov 54
: : >
: : > days(mans.met$date[4])
: : [1] 4
: : 31 Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < ... < 31
: : > test<-days(mans.met$date[4])
: : > test
: : [1] 4
: : 31 Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < ... < 31
: : >
: :
: : Any suggestions will be appreciated.
: : -Ben Osborne
: 
: I guess you set up mans.met as numeric columns and when you
: assign your factors to numeric variables you get
: the underlying codes.  Note that if f is a factor then as.numeric(f)
: gives the codes underlying the factor whereas as.character(f) gives
: the labels.
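
To illustrate the codes-versus-labels point, here is a minimal sketch. The factor f is a made-up stand-in for what years() returns, and x is a hypothetical numeric column; neither name comes from the original data.

```r
# Hypothetical stand-in for the result of years(): a factor with one level, "1954"
f <- factor(c("1954", "1954", "1954"))

codes  <- as.numeric(f)                 # underlying integer codes
labels <- as.character(f)               # the labels, as character strings
values <- as.numeric(as.character(f))   # the labels converted to numbers

# Assigning a factor element into a numeric vector keeps only its code,
# which is why the year column showed 1 rather than 1954
x <- numeric(3)
x[1] <- f[1]

codes
labels
values
x[1]
```

Going through as.character() first is what recovers 1954 as a number.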
: 
: It would be better not to use a loop at all.  I don't know whether or not
: you want factors, but at any rate here is something you could
: try.  It creates data frame df2 without a loop.
: 
: df2 <- data.frame(date = mans$date, yearmo = as.character(cut(mans$date, "m")))
: df2 <- cbind(df2, month.day.year(mans$date))
: 
: Finally, do you really want this redundant representation?  I would tend to
: go with just storing the dates and computing any of the other quantities
: on-the-fly as needed.
: 
: ##########
: The reason for the redundancy is that I will want to summarize these 50 years of
: daily time series data by month, so that records that share each unique year
: and month in the mans.met$yearmo column will be summed or averaged, etc. into a
: new row in another data frame (mans.monthly, having
: nrow=length(unique(mans.met$yearmo))).  The way I would do this is again using
: a forloop, but the loop won't recognize:
:      for (i in 1:(length(unique(mans.met$yearmo[i])))){

This seems circular.  You are defining i in terms of i.
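
If a loop is still wanted, one way around the circularity is to compute the unique values once, before the loop, and index over those. This is only a sketch: yearmo and prcp below are made-up vectors standing in for mans.met$yearmo and one measurement column.

```r
# Made-up stand-ins for mans.met$yearmo and a measurement column
yearmo <- c("Nov 54", "Nov 54", "Dec 54", "Dec 54", "Jan 55")
prcp   <- c(1, 2, 3, 4, 5)

ym <- unique(yearmo)             # computed once, so the loop bound does not depend on i
monthly <- numeric(length(ym))   # one slot per distinct year-month

for (i in seq_along(ym)) {
  # sum the records belonging to the i-th distinct year-month
  monthly[i] <- sum(prcp[yearmo == ym[i]])
}
names(monthly) <- ym
monthly
```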

: 
: What I really need to know is why I can call any ith of
:      unique(mans.met$yearmo[i])
: by itself, but not in a loop.
: 
: Or, perhaps there is an even easier way to extract the year and month from the
: date column on the fly to compute these summaries?

Look at ?aggregate, ?by and ?tapply.  e.g.

   aggregate(mans[,-1], list(cut(mans$date, "m")), mean)
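
A self-contained sketch of the same idea follows. It uses made-up data and, as an assumption so that it runs without the chron package, base R's Date class with format() in place of chron's cut(); the names daily, yearmo and monthly are illustrative only.

```r
# Made-up daily data; Date class used here instead of chron (an assumption)
daily <- data.frame(
  date = as.Date("1954-11-29") + 0:3,   # 29 Nov, 30 Nov, 1 Dec, 2 Dec 1954
  prcp = c(1, 3, 5, 7),
  tmin = c(-2, 0, 2, 4)
)

# Year-month label computed on the fly from the dates
yearmo <- format(daily$date, "%Y-%m")

# One row per month, with the mean of every measurement column
monthly <- aggregate(daily[, -1], by = list(yearmo = yearmo), FUN = mean)
monthly

# The same grouping for a single column with tapply
prcp.monthly <- tapply(daily$prcp, yearmo, mean)
prcp.monthly
```

With real data containing NAs, extra arguments such as na.rm = TRUE can be passed through aggregate() or tapply() to the summary function.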



