[R] Aggregate counts of observations with times surrounding a time?

Mark Noworolski jmarkn at gmail.com
Tue May 16 06:48:45 CEST 2017


I have a data frame that has a set of observed dwell times at a set of
locations. The metadata for the locations includes things that have varying
degrees of specificity. I'm interested in tracking the number of people
present at a given time in a given store, type of store, or zip code.

Here's some sample data (st = start time, et = end time, as Unix
timestamps):
data.frame(st       = seq(1483360938, by = 1700, length.out = 10),
           et       = seq(1483362938, by = 1700, length.out = 10),
           store    = c(rep("gap", 5), rep("starbucks", 5)),
           zip      = c(94000, 94000, 94100, 94100, 94200,
                        94000, 94000, 94100, 94100, 94200),
           store_id = seq(50, 59))
           st         et     store   zip store_id
1  1483360938 1483362938       gap 94000       50
2  1483362638 1483364638       gap 94000       51
3  1483364338 1483366338       gap 94100       52
4  1483366038 1483368038       gap 94100       53
5  1483367738 1483369738       gap 94200       54
6  1483369438 1483371438 starbucks 94000       55
7  1483371138 1483373138 starbucks 94000       56
8  1483372838 1483374838 starbucks 94100       57
9  1483374538 1483376538 starbucks 94100       58
10 1483376238 1483378238 starbucks 94200       59

I'd like to be able to:
a) create aggregates of the number of people present in each store_id
at a given time
b) create aggregates of the number of people present, grouped by zip or
store (rough sketch below)
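
To make (a) and (b) concrete, here is roughly the kind of thing I have
in mind, in base R (assuming the sample frame above is assigned to a
variable dt; present_at is just a name I made up):

# Count the observations whose [st, et] interval covers time t, grouped
# by one of the metadata columns.  Assumes the sample frame above is
# stored in `dt` and that at least one interval covers t.
present_at <- function(dt, t, by = "store_id") {
  covering <- dt[dt$st <= t & dt$et >= t, ]
  aggregate(list(n_present = covering$store_id), covering[by], FUN = length)
}

present_at(dt, 1483362700)              # store_ids 50 and 51, 1 person each
present_at(dt, 1483362700, by = "zip")  # zip 94000, 2 people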

I expect to roll up to hour or half-hour buckets, but I don't think I
should have to decide the bucket size up front. Ideally I could do
something clever with ggplot plus some other library to plot the time
evolution of this information, rolled up however I want.
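
For example, something along these lines is roughly what I'm picturing,
using dplyr/tidyr/ggplot2, with the bucket width left as a parameter
(occupancy is just a name I made up, and I'm assuming a dwell counts
toward every bucket it touches):

library(dplyr)
library(tidyr)
library(ggplot2)

# Expand each dwell into every bucket it touches, then count rows per
# group per bucket.  `bucket` is the bucket width in seconds; `dt` is
# the sample frame above.
occupancy <- function(dt, bucket = 3600) {
  dt %>%
    rowwise() %>%
    mutate(bucket_start = list(seq(floor(st / bucket) * bucket,
                                   floor(et / bucket) * bucket,
                                   by = bucket))) %>%
    ungroup() %>%
    unnest(cols = bucket_start)
}

occ <- occupancy(dt, bucket = 1800) %>%            # half-hour buckets
  count(zip, bucket_start, name = "n_present")     # or count(store, ...)

ggplot(occ, aes(x = as.POSIXct(bucket_start, origin = "1970-01-01"),
                y = n_present, colour = factor(zip))) +
  geom_step() +
  labs(x = "time", y = "people present")

Swapping count(zip, ...) for count(store, ...) or count(store_id, ...)
would give the other groupings, and changing bucket changes the rollup.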

Any clever solutions? I've trawled Stack Overflow and this mailing list
to no avail, but I'm willing to acknowledge I may have missed something.
