[R] Converting monthly data to quarterly data

Gabor Grothendieck ggrothendieck at gmail.com
Tue Aug 19 13:35:31 CEST 2008


Another possibility is to use the yearmon and yearqtr classes in the
zoo package.  Assuming quarters end in Feb, May, Aug and Nov
we have (using dat from the prior post):

> library(zoo)
> z <- zoo(dat$price, dat$date)
> aggregate(z, as.yearqtr(as.yearmon(time(z)) + 1/12), mean)
 2008 Q1  2008 Q2  2008 Q3  2008 Q4  2009 Q1  2009 Q2
 5.19913 21.86151 26.06295 29.68677 41.92547 48.10799

or to use calendar quarters its just:

aggregate(z, as.yearqtr, mean)

In the zoo package see ?zoo, ?aggregate.zoo, ?yearqtr, ?yearmon
and the three zoo vignettes.

On Mon, Aug 18, 2008 at 2:38 PM, Gavin <gavin.simpson at ucl.ac.uk> wrote:
> On Mon, 2008-08-18 at 14:31 +0100, Denise Xifara wrote:
>> Thank you very much Stephen, but how will aggregate deal with months that
>> fall outside annual quarters? eg, one extra month at the end of the dataset?
>
> [Without your data I'm kind of guessing at the exact format and problem,
> but the example below shows one way to deal with quarters that span
> years so should be adaptable to your problem. You also have a different
> idea of quarters to my ecological one...]
>
> ## first some dummy data
> set.seed(12345)
> dat <- data.frame(price = cumsum(rnorm(450)),
>                  date = seq(from = as.Date("2008-01-01"),
>                             length = 450, by = "days"))
>
> ## get the months of observations
> dat$month <- factor(format(dat$date, format = "%b"), levels = month.abb)
>
> ## and then format this for the quarters
> dat$quarter <- character(length = NROW(dat))
> ## I'm sure these 4 steps can be simplified but just
> ## how escapes me at the moment
> dat$quarter[dat$month %in% month.abb[c(12,1,2)]] <- "Winter"
> dat$quarter[dat$month %in% month.abb[c(3:5)]] <- "Spring"
> dat$quarter[dat$month %in% month.abb[c(6:8)]] <- "Summer"
> dat$quarter[dat$month %in% month.abb[c(9:11)]] <- "Autumn"
> dat$quarter <- factor(dat$quarter,
>                      levels = c("Spring","Summer","Autumn","Winter"))
>
> ## look at the fruits of our labour
> head(dat)
>
> ## create period
> runs <- rle(as.numeric(dat$quarter))$lengths
> dat$period <- factor(rep(seq_along(runs), times = runs))
>
> ## aggregate
> with(dat, aggregate(price, list(quarter = quarter, period = period),
>                    FUN = mean))
>
> I use the rle() function (run length encoding) calculate the number of
> observations where the 'quarter' remains the same:
>
>> rle(as.numeric(dat$quarter))
> Run Length Encoding
>  lengths: int [1:6] 60 92 92 91 90 25
>  values : num [1:6] 4 1 2 3 4 1
>
> The 'values' here are the numeric representation of the quarter factor.
> The most interesting for us is the second 4 - this is the winter 2008/9.
> I use the lengths to replicate a period number (1,2,...,n) the correct
> number of times. Now we have the period correctly calculated, we just
> aggregate by quarter and period to give the averages you want.
>
> If you are working on months 1-3 as quarter 1, 4-6 as quarter 2 etc,
> then it is much easier, just aggregate by quarter and year:
>
> ## copy the data above
> dat2 <- dat
> ## change meaning of quarter
> dat2$quarter <- character(length = NROW(dat2))
> dat2$quarter[dat2$month %in% month.abb[c(1:3)]] <- "Q1"
> dat2$quarter[dat2$month %in% month.abb[c(4:6)]] <- "Q2"
> dat2$quarter[dat2$month %in% month.abb[c(7:9)]] <- "Q3"
> dat2$quarter[dat2$month %in% month.abb[c(10:12)]] <- "Q4"
> dat2$quarter <- factor(dat2$quarter, levels = c("Q1","Q2","Q3","Q4"))
> ## year variable
> dat2$year <- factor(format(dat2$date, format = "%Y"))
>
> ## drop the first 40 days to simulate a late starting record
> ## and aggregate
> with(dat2[-(1:40), ], aggregate(price, list(quarter = quarter, year =
> year), FUN = mean))
>
> Which gives:
>  quarter year        x
> 1      Q1 2008 13.58644
> 2      Q2 2008 24.16523
> 3      Q3 2008 28.56004
> 4      Q4 2008 32.60900
> 5      Q1 2009 44.86594
>
> Do these examples help solve your problem?
>
> G
>>
>> 2008/8/18 stephen sefick <ssefick at gmail.com>
>>
>> > ?aggregate
>> > may do what you want
>> >
>> > On Mon, Aug 18, 2008 at 8:19 AM, Denise Xifara
>> > <dionysia-kiara.xifaras at st-hildas.ox.ac.uk> wrote:
>> > > Dear R users,
>> > >
>> > > I have a dataframe where column is has countries, column 2 is dates
>> > > (monthly) for each countrly, the next 10 columns are my factors where I
>> > have
>> > > measurements for each country and  for each date.  I have attached a
>> > sample
>> > > of the data in csv format with the data for 3 countries.
>> > >
>> > > I would like to convert my monthly data into quarterly data, finding the
>> > > mean over 3 month periods for factors a-i, and the sum for factor j.  My
>> > > problem is that not all countries have starting date at the beginning of
>> > a
>> > > quarter for a particular year, ie some countries start in May or
>> > September,
>> > > and also some countries have one extra month, some have two extra months
>> > so
>> > > there's no way of deleting some rows with a simple command (I want to get
>> > > rid of all extra data that does not fall into the quarters for each
>> > > country), since the amount of data to get rid of for each country varies.
>> > >
>> > > I tried for example:
>> > > i=1
>> > > denise<-data[((data$country)==unique(data$country[i]),]
>> > > denise[,2]<- as.Date(denise$date, "%Y-%m-%d")
>> > > denise2<-denise[order(denise[,2],decreasing=FALSE),]
>> > > len<-length(denise[,1])
>> > > limit<-floor(len/3)+1
>> > > splitter<-rep(1:limit,each=3)
>> > > spl.dat<-split(denise2,splitter)
>> > > new.data<-as.matrix(lapply(spl.dat,FUN="mean"))
>> > >
>> > > This finds the mean every 3 rows but this doesnt consider the data
>> > quarterly
>> > > in a calendar sense.  ie if the data starts in november, it doesnt
>> > discard
>> > > the data for november, december and start calculating the means from
>> > january
>> > > onwards, until the month where the last quarter finishes, discarding any
>> > > extra month, or two months at the end.
>> > >
>> > > I tried converting my data frame/matrix to a time series but the dates
>> > are
>> > > not kept.  I got:
>> > >
>> > >>tser<-as.ts(denise)
>> > > Warning message:
>> > > In data.matrix(data) : class information lost from one or more columns
>> > >
>> > > and column 2 has become a list of numbers rather than dates.
>> > >
>> > > I tried:
>> > >
>> > >> library(fCalendar)
>> > >> den.tseries<-as.timeSeries(denise)
>> > > Warning messages:
>> > > 1: In .whichFormat(charvec, ...) : Could not determine time(date) format
>> > > 2: In .whichFormat(charvec, ...) : Could not determine time(date) format
>> > >> is.timeSeries(den.tseries)
>> > > [1] TRUE
>> > >> apply.quarterly(den.tseries,FUN="mean")
>> > >                   data
>> > > 1970-01-01 -2.425000000
>> > > 1970-04-01 -0.557961111
>> > > 1970-04-28  0.009814815
>> > >
>> > > Here, it calculates things quarterly but the the as.timeSeries command
>> > has
>> > > assigned its own daily dates to the data, instead of keeping my monthly
>> > > dates.  Also, I don't understand how it deals with the extra dates.
>> > >
>> > > Sorry for the long email,
>> > > Any help would be very much appreciated,
>> > > Kind regards,
>> > > Denise
>> > >
>> > > ______________________________________________
>> > > R-help at r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
>> > > and provide commented, minimal, self-contained, reproducible code.
>> > >
>> > >
>> >
>> >
>> >
>> > --
>> > Let's not spend our time and resources thinking about things that are
>> > so little or so large that all they really do for us is puff us up and
>> > make us feel like gods. We are mammals, and have not exhausted the
>> > annoying little problems of being mammals.
>> >
>> >        -K. Mullis
>> >
>>
>>       [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list