[R] Using tapply to create a new table

Marc Schwartz marc_schwartz at comcast.net
Fri Jan 26 19:08:22 CET 2007


On Fri, 2007-01-26 at 12:39 -0500, Kalish, Josh wrote:
> All,
> 
> I'm sure that this is covered somewhere, but I can't seem to find a
> good explanation.  I have an existing table that contains information
> grouped by date.  This is as so:
> 
> Day		NumberOfCustomers		NumberOfComplaints
> 20060512	10040				40
> 20060513	32420				11
> ...
> 
> 
> I also have a table at the detail level as so:
> 
> Day		Meal		PricePaid	UsedCupon
> 20060512	Fish		14		Y
> 20060512	Chicken	20		N
> ...
> 
> Is there a simple way to create summaries on the detail table and then
> join them into the first table above so that it looks like this:
> 
> Day		NumberOfCustomers		NumberOfComplaints	AveragePricePaid	NumberUsingCupon
> 
> 
> I can do a tapply to get what I want from the detail table, but I
> can't figure out how to turn that into a table and join it back in.
> 
> 
> 
> Thanks,
> 
> Josh 

Skipping the steps of using tapply() or aggregate() to get the
summarized data from the second data frame, you would then use merge()
to perform a SQL-like 'join' operation:

> DF.1
       Day NumberOfCustomers NumberOfComplaints
1 20060512             10040                 40
2 20060513             32420                 11

> DF.2
       Day    Meal PricePaid UsedCupon
1 20060512    Fish        14         Y
2 20060512 Chicken        20         N

> merge(DF.1, DF.2, by = "Day")
       Day NumberOfCustomers NumberOfComplaints    Meal PricePaid
1 20060512             10040                 40    Fish        14
2 20060512             10040                 40 Chicken        20
  UsedCupon
1         Y
2         N
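
For the summarizing step that is skipped above, here is a minimal sketch of one way to turn tapply() results into a data frame with one row per Day. The DF.2 below is re-created from the two sample rows in the post, and the names DF.sum, avg and n.cupon are only illustrative, not anything from the original data:

```r
# Detail table re-created from the two sample rows shown in the post
DF.2 <- data.frame(Day = c(20060512, 20060512),
                   Meal = c("Fish", "Chicken"),
                   PricePaid = c(14, 20),
                   UsedCupon = c("Y", "N"))

# tapply() returns a vector named by the grouping values (here, Day)
avg     <- tapply(DF.2$PricePaid, DF.2$Day, mean)
n.cupon <- tapply(DF.2$UsedCupon == "Y", DF.2$Day, sum)

# Both results are grouped on the same variable, so they line up
# element by element; the names carry the Day values as character
DF.sum <- data.frame(Day = as.numeric(names(avg)),
                     AveragePricePaid = as.vector(avg),
                     NumberUsingCupon = as.vector(n.cupon))

print(DF.sum)
```

A data frame built this way has a single row per Day, so it can be passed to merge() in place of the raw detail table.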


By default, only rows matching on the 'by' argument in both data frames
will be in the result. See the 'all.x' and 'all.y' arguments to handle
other scenarios of including non-matching rows.
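
As a rough illustration of the difference, assuming a per-day summary that only covers 20060512 (the DF.sum values here are worked out by hand from the two sample detail rows, and the object names are just placeholders):

```r
# First table, re-created from the post
DF.1 <- data.frame(Day = c(20060512, 20060513),
                   NumberOfCustomers = c(10040, 32420),
                   NumberOfComplaints = c(40, 11))

# Assumed per-day summary; only one Day has detail rows
DF.sum <- data.frame(Day = 20060512,
                     AveragePricePaid = 17,
                     NumberUsingCupon = 1)

# Default: only Days present in both data frames are kept
inner <- merge(DF.1, DF.sum, by = "Day")

# all.x = TRUE: keep every row of DF.1, filling NA where DF.sum
# has no matching Day
left <- merge(DF.1, DF.sum, by = "Day", all.x = TRUE)

print(inner)
print(left)
```

With all.x = TRUE the row for 20060513 stays in the result, with NA in the two summary columns.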

See ?merge, which BTW:

  help.search("join")

would point you to, if you are familiar with the term from relational
database operations.

HTH,

Marc Schwartz


