[R] crosstabs and histograms with flexible binning of dates

davideps davideps at umich.edu
Wed Apr 4 20:00:02 CEST 2012


Hi,

First, thank you to Duncan Mackay for getting me started processing dates
with R. Unfortunately, I need to do a little more than I initially expected.
I have 5K lines of data that look like this:

ID     AREA       DATE
0001   Center     2010-10-15
0002   Center     2010-01-02
0003   NorthWest  2010-02-05
0004   SouthWest  2010-05-11

I would like to write a script that builds crosstabs like the one below, but
that (1) can easily be used to create small multiples with lattice or
ggplot2 and (2) provides flexible binning options, such as monthly bins
starting from a specific day. Should I build the crosstab manually, or can I
use a histogram function to generate it on the way to generating a graphic?

AREA       1/2010-3/2010  4/2010-6/2010  7/2010-9/2010  10/2010-12/2010
Center           1              0              0               1
NorthWest        1              0              0               0
SouthWest        0              1              0               0
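
For reference, a table of this shape can be sketched in base R by cutting the
dates at explicit break points and passing the result to table(). This is only
a minimal sketch on the four sample rows above: bin_dates() and its arguments
are hypothetical names made up for illustration, the base date of 2010-01-01
is an assumption chosen to reproduce the quarterly columns, and bins are
labelled by their first day rather than as "1/2010-3/2010".

```r
# Sample rows from above; DATE is converted to class Date.
parcels <- data.frame(
  ID   = c("0001", "0002", "0003", "0004"),
  AREA = c("Center", "Center", "NorthWest", "SouthWest"),
  DATE = as.Date(c("2010-10-15", "2010-01-02", "2010-02-05", "2010-05-11"))
)

# bin_dates() is a hypothetical helper, not part of any package.
bin_dates <- function(dates, basedate, interval = 3, num_intervals = 4) {
  # num_intervals + 1 break points, each `interval` months apart
  breaks <- seq(basedate, by = paste(interval, "months"),
                length.out = num_intervals + 1)
  # Bins are [start, next start); each is labelled by its first day.
  cut(dates, breaks = breaks,
      labels = format(breaks[-length(breaks)]), right = FALSE)
}

parcels$BIN <- bin_dates(parcels$DATE, as.Date("2010-01-01"))
tab <- table(AREA = parcels$AREA, BIN = parcels$BIN)
print(tab)
```

Dates that fall outside the break points become NA and are dropped by table(),
and empty bins are kept as zero columns because they remain factor levels.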

Below is my code to handle the arbitrary bins, but I'm guessing there are
useful libraries and more elegant approaches. Any pointers would be
appreciated.

library(foreign)  # provides read.dbf()

# LOAD FILE 
#parcels=read.dbf()      #depending on source file
parcels=read.delim("~/Projects/GIS_DATA/Parcels_NSP_BlockGroup.txt")
attach(parcels)

# DEFINE BINNING
basedate=as.Date("2011/05/11")
currentdate=basedate
interval=3 #width of interval in months. 3 = quarterly
num_intervals=5 #how many intervals to include after basedate

for (i in seq_len(num_intervals)) {
  startdate=currentdate
  # enddate is "interval" months after startdate: take element interval+1
  # of a monthly sequence that begins at startdate.
  enddate=seq(startdate,by="month",length.out=interval+1)[interval+1]
  # crosstab construction of single column here
  # add column to final dataframe
  currentdate=enddate
}
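
As for the small multiples, one possible route is to keep the counts in long
format (one row per AREA and bin) instead of a wide crosstab, since lattice
and ggplot2 both condition on columns. The following is a rough sketch under
the same assumptions as the sample data above (quarterly bins from an assumed
base date of 2010-01-01); the names counts and p are only illustrative.

```r
library(lattice)  # recommended package that ships with R

parcels <- data.frame(
  AREA = c("Center", "Center", "NorthWest", "SouthWest"),
  DATE = as.Date(c("2010-10-15", "2010-01-02", "2010-02-05", "2010-05-11"))
)

# Five break points give four quarterly bins, each closed on the left.
breaks <- seq(as.Date("2010-01-01"), by = "3 months", length.out = 5)
parcels$BIN <- cut(parcels$DATE, breaks = breaks, right = FALSE)

# as.data.frame() on a table gives one row per AREA/BIN pair plus Freq.
counts <- as.data.frame(table(AREA = parcels$AREA, BIN = parcels$BIN))

# One panel per AREA; the plot is only drawn when p is printed.
p <- barchart(Freq ~ BIN | AREA, data = counts, layout = c(1, 3))
# print(p)
```

The same counts data frame could be passed to ggplot2 with a facet on AREA;
that variant is not shown here.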


Thank you,
-david


