[R] summing values by week - based on daily dates - but with some dates missing

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Wed Mar 30 21:35:17 CEST 2011


Henrique, this is great, thank you!

It's almost what I was looking for! Only one small thing: it doesn't
"merge" the results for weeks that "straddle" 2 years. In my example,
the last days of 2008 (starting Monday 2008-12-29) and the first days
of 2009 fall in the same week, but they come out as two separate weeks.
Is there any way to "join" them?
I'm asking because in reality I'll have many years and hundreds of
groups, so it would be hard to do manually.
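
One possible way to join such weeks, sketched under the assumption that
each week should be labelled by the date of its Monday (as in the desired
output quoted below): subtract every date's offset from Monday and
aggregate on the resulting date instead of on "%Y.%W". The helper name
week_start and the toy data frame are invented for illustration; they are
not part of Henrique's suggestion.

```r
# Hypothetical helper: map each date to the Monday that starts its week.
week_start <- function(d) {
  d <- as.Date(d)
  # as.POSIXlt()$wday is 0 for Sunday ... 6 for Saturday;
  # shift it so that Monday = 0 ... Sunday = 6.
  days_since_monday <- (as.POSIXlt(d)$wday + 6) %% 7
  d - days_since_monday
}

# Toy data (made up): one group, with gaps, straddling the 2008/2009 boundary.
toy <- data.frame(
  dates = as.Date(c("2008-12-29", "2008-12-31", "2009-01-02",
                    "2009-01-04", "2009-01-05", "2009-01-08")),
  group = "group.1",
  value = c(1, 2, 3, 4, 5, 6)
)

# Aggregate on the week-start date, so the calendar year plays no role.
toy$week <- week_start(toy$dates)
weekly <- aggregate(value ~ group + week, data = toy, FUN = sum)
print(weekly)
```

Because the grouping key is a real date rather than a year/week-number
string, the days from 2008-12-29 through 2009-01-04 land in one row, and
missing Mondays do not matter: any date in the week maps to the same key.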


BTW - does format(dates,"%Y.%W") always consider weeks as starting with Mondays?
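
For reference, ?strptime documents "%W" as the week of the year (00-53)
with Monday as the first day of the week, and "%U" as the same count with
Sunday as the first day. A quick check on dates from the example, offered
as a sketch rather than a full answer (the variable names are made up):

```r
# Dates around the 2008/2009 boundary; 2008-12-29 is a Monday.
d <- as.Date(c("2008-12-29", "2009-01-04", "2009-01-05"))

# Monday-based week numbers: the Sunday 2009-01-04 still belongs to the
# week before the first Monday of 2009, which %W labels as week 00.
w_monday <- format(d, "%Y.%W")

# Sunday-based week numbers: 2009-01-04 already starts week 01.
w_sunday <- format(d, "%Y.%U")

print(data.frame(date = d, W = w_monday, U = w_sunday))
```

This also shows why the year-straddling week splits: the week number
restarts within each calendar year, so 2008-12-29 and 2009-01-04 receive
different "%Y.%W" labels even though they fall in the same Monday-based week.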

Thank you very much!
Dimitri


On Wed, Mar 30, 2011 at 2:55 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
> Try this:
>
> aggregate(value ~ group + format(dates, "%Y.%W"), myframe, FUN = sum)
>
>
> On Wed, Mar 30, 2011 at 11:23 AM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
>> Dear everybody,
>>
>> I have the following challenge. I have a data set with 2 subgroups,
>> dates (days), and corresponding values (see example code below).
>> Within each subgroup: I need to aggregate (sum) the values by week -
>> for weeks that start on a Monday (for example, 2008-12-29 was a
>> Monday).
>> I find it difficult because I have missing dates in my data - so that
>> sometimes I don't even have the date for some Mondays. So, I can't
>> write a proper loop.
>> I want my output to look something like this:
>> group    dates       value
>> group.1  2008-12-29  3.0937
>> group.1  2009-01-05  3.8833
>> group.1  2009-01-12  1.362
>> ...
>> group.2  2008-12-29  2.250
>> group.2  2009-01-05  1.4057
>> group.2  2009-01-12  3.4411
>> ...
>>
>> Thanks a lot for your suggestions! The code is below:
>> Dimitri
>>
>> ### Creating example data set:
>> mydates <- rep(seq(as.Date("2008-12-29"), length.out = 43, by = "day"), 2)
>> myfactor <- c(rep("group.1", 43), rep("group.2", 43))
>> set.seed(123)
>> myvalues <- runif(86, 0, 1)
>> myframe <- data.frame(dates = mydates, group = myfactor, value = myvalues)
>> (myframe)
>> dim(myframe)
>>
>> ## Removing some rows (dates) unsystematically:
>> set.seed(123)
>> removed.group1 <- sample(1:43, size = 11, replace = FALSE)
>> set.seed(456)
>> removed.group2 <- sample(44:86, size = 11, replace = FALSE)
>> to.remove <- c(removed.group1, removed.group2)
>> length(to.remove)
>> to.remove <- sort(to.remove)
>> myframe <- myframe[-to.remove, ]
>> (myframe)
>>
>>
>>
>> --
>> Dimitri Liakhovitski
>> Ninah Consulting
>> www.ninah.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>



-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


