[R] summing values by week - based on daily dates - but with some dates missing

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Thu Mar 31 15:29:17 CEST 2011


Thank you so much, everyone, for your help.
Extremely valuable suggestions and extremely valuable learnings!
Dimitri

On Thu, Mar 31, 2011 at 5:03 AM, Martyn Byng <Martyn.Byng at nag.co.uk> wrote:
> Hi,
>
> Yep, that was what it was doing. For a sum across week, try something like
>
> get.week.flag <- function(dd) {
>  ## get weekday from the date in dd and code it as Monday = 1, Tuesday = 2 etc
>  ## (weekdays() returns locale-dependent names; these levels assume English)
>  idd <- factor(weekdays(dd),
>                levels = c("Monday", "Tuesday", "Wednesday", "Thursday",
>                           "Friday", "Saturday", "Sunday"))
>
>  ## convert to numeric
>  ndd <- as.numeric(idd)
>
>  ## flag entries where the weekday code decreases (this marks a change of week)
>  wflag <- c(FALSE, ndd[-length(ndd)] > ndd[-1])
>
>  ## cumulative sum to get the week flag
>  cumsum(wflag) + 1
> }
>
> to get a week flag. This assumes that your data is sorted by date; if not, you'll have to sort it first. If you want the week to start on a different day, just change the ordering of the weekdays in the levels argument.
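To illustrate the flag logic on a small case, here is a minimal sketch with a made-up date vector (dd below is hypothetical, not taken from the data in this thread). It uses format(dd, "%u") rather than weekdays() so that the weekday codes do not depend on the locale; that substitution is an assumption of this sketch, not part of the function above. It also shows one case worth knowing about: when a gap jumps over a week boundary but lands on a later weekday, the code does not decrease and no new week is flagged. Stepping each date back to its own Monday avoids that.

```r
## Hypothetical dates with gaps; 2008-12-29 is a Monday
dd <- as.Date(c("2008-12-29", "2008-12-31", "2009-01-04",
                "2009-01-06", "2009-01-09", "2009-01-17"))

## Weekday code, Monday = 1 ... Sunday = 7 (locale-independent)
ndd <- as.integer(format(dd, "%u"))

## Flag approach: a new week starts wherever the weekday code decreases
wflag <- c(FALSE, ndd[-length(ndd)] > ndd[-1])
week.flag <- cumsum(wflag) + 1

## Date approach: step back to the Monday of the week each date falls in
week.start <- dd - (ndd - 1L)

data.frame(date = dd, code = ndd, week.flag, week.start)
```

In this example 2009-01-09 (a Friday) and 2009-01-17 (a Saturday) get the same week.flag, although week.start places them in the weeks beginning 2009-01-05 and 2009-01-12.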
>
> data.frame(date=myframe$dates,day=weekdays(myframe$dates),week=get.week.flag(myframe$dates))
>
> seems to indicate that the function is doing what it should, so you can then amend the previous code to use get.week.flag instead of weekdays, as in
>
> sum.by.week <- function(ff) {
>  ## split the values by week flag, then sum within each week
>  by.week <- split(ff$value,get.week.flag(ff$dates))
>  lapply(by.week,sum)
> }
>
> by.grp <- split(myframe,myframe$group)
> lapply(by.grp,sum.by.week)
>
> Martyn
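For reference, one way to get the table requested in the original post (one row per group and week, labelled by the Monday on which the week starts) is to compute each date's Monday directly and sum with aggregate(). This is a sketch under assumptions, not the method suggested above: week.start is a hypothetical helper, "%u" is used for the weekday number (Monday = 1), and the data set is rebuilt from the example code in the original post so that the block runs on its own.

```r
## Example data as in the original post
mydates  <- rep(seq(as.Date("2008-12-29"), length.out = 43, by = "day"), 2)
myfactor <- rep(c("group.1", "group.2"), each = 43)
set.seed(123)
myframe  <- data.frame(dates = mydates, group = myfactor,
                       value = runif(86, 0, 1))

## Drop 11 dates from each group, as in the original post
set.seed(123)
removed.group1 <- sample(1:43, size = 11, replace = FALSE)
set.seed(456)
removed.group2 <- sample(44:86, size = 11, replace = FALSE)
myframe <- myframe[-c(removed.group1, removed.group2), ]

## Hypothetical helper: the Monday of the week containing each date
week.start <- function(dd) dd - (as.integer(format(dd, "%u")) - 1L)

## Sum values within each group and week
myframe$week <- week.start(myframe$dates)
weekly <- aggregate(value ~ group + week, data = myframe, FUN = sum)
weekly <- weekly[order(weekly$group, weekly$week), ]
names(weekly)[2] <- "dates"
weekly
```

Because the week is derived from the date itself, this does not need the data to be sorted and is not affected by missing Mondays. A week in which every date was removed simply does not appear in the result. The sums printed depend on the random number generation of the R version in use, so they may not match the figures quoted in the original post.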
>
> -----Original Message-----
> From: Dimitri Liakhovitski [mailto:dimitri.liakhovitski at gmail.com]
> Sent: 30 March 2011 18:03
> To: Martyn Byng
> Cc: r-help
> Subject: Re: [R] summing values by week - based on daily dates - but with some dates missing
>
> Thank you, Martyn.
> But it looks like this way we are getting sums by day - i.e., across
> all Mondays, all Tuesdays, etc.
> Maybe I did not explain well, sorry! The desired output would contain
> sums for each WHOLE week - across all days that comprise that week -
> Monday through Sunday.
> Makes sense?
> Dimitri
>
> On Wed, Mar 30, 2011 at 12:53 PM, Martyn Byng <Martyn.Byng at nag.co.uk> wrote:
>> Hi,
>>
>> How about something like:
>>
>> sum.by.day <- function(ff) {
>>  by.day <- split(ff$value,weekdays(ff$dates))
>>  lapply(by.day,sum)
>> }
>>
>> by.grp <- split(myframe,myframe$group)
>>
>> lapply(by.grp,sum.by.day)
>>
>>
>> Martyn
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Dimitri Liakhovitski
>> Sent: 30 March 2011 15:23
>> To: r-help
>> Subject: [R] summing values by week - based on daily dates - but with
>> some dates missing
>>
>> Dear everybody,
>>
>> I have the following challenge. I have a data set with 2 subgroups,
>> dates (days), and corresponding values (see example code below).
>> Within each subgroup: I need to aggregate (sum) the values by week -
>> for weeks that start on a Monday (for example, 2008-12-29 was a
>> Monday).
>> I find it difficult because I have missing dates in my data - so that
>> sometimes I don't even have the date for some Mondays. So, I can't
>> write a proper loop.
>> I want my output to look something like this:
>> group    dates       value
>> group.1  2008-12-29  3.0937
>> group.1  2009-01-05  3.8833
>> group.1  2009-01-12  1.362
>> ...
>> group.2  2008-12-29  2.250
>> group.2  2009-01-05  1.4057
>> group.2  2009-01-12  3.4411
>> ...
>>
>> Thanks a lot for your suggestions! The code is below:
>> Dimitri
>>
>> ### Creating example data set:
>> mydates<-rep(seq(as.Date("2008-12-29"), length = 43, by = "day"),2)
>> myfactor<-c(rep("group.1",43),rep("group.2",43))
>> set.seed(123)
>> myvalues<-runif(86,0,1)
>> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues)
>> (myframe)
>> dim(myframe)
>>
>> ## Removing some rows (dates) unsystematically:
>> set.seed(123)
>> removed.group1<-sample(1:43,size=11,replace=FALSE)
>> set.seed(456)
>> removed.group2<-sample(44:86,size=11,replace=FALSE)
>> to.remove<-c(removed.group1,removed.group2);length(to.remove)
>> to.remove<-to.remove[order(to.remove)]
>> myframe<-myframe[-to.remove,]
>> (myframe)
>>
>>
>>
>> --
>> Dimitri Liakhovitski
>> Ninah Consulting
>> www.ninah.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>> ________________________________________________________________________
>> This e-mail has been scanned for all viruses by Star.
>> ________________________________________________________________________
>>
>> ________________________________________________________________________
>> The Numerical Algorithms Group Ltd is a company registered in England
>> and Wales with company number 1249803. The registered office is:
>> Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.
>>
>> This e-mail has been scanned for all viruses by Star. The service is
>> powered by MessageLabs.
>> ________________________________________________________________________
>>



-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


