[R] Reshape or Plyr?

Law, Jason Jason.Law at portlandoregon.gov
Mon Apr 22 18:35:47 CEST 2013


Hi Bruce,

I work with a lot of similar data and have to do these types of things quite often.  I find it helps to keep to vectorized code as much as possible.  That is, do as many of the calculations as possible outside of the aggregation code.  Here's one way:

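library(reshape2)
# stick to a variable naming convention and you'll avoid a lot of simple code errors
names(d)   <- gsub('_', '.', tolower(names(d)), fixed = TRUE)
# melt only the two columns to be summed; every other column becomes an id variable
dm         <- melt(d, measure.vars = c('ai', 'survey.time'))
# sum ai and survey.time within each location/species combination
results    <- dcast(dm, location.name + spec.code ~ variable, fun.aggregate = sum)
# relative abundance per 10 survey hours, computed outside the aggregation
results$ra <- results$ai / results$survey.time * 10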

The output:

       location.name spec.code ai survey.time         ra
1  079-f2p1-Acetuna      Buzz   8        72.8  1.0989011
2  079-f2p1-Acetuna    Eumspp   5        24.3  2.0576132
3  079-f2p1-Acetuna      Frag  18        12.1 14.8760331
4  079-f2p1-Acetuna    Molmol   1        12.1  0.8264463
5  079-f2p1-Acetuna    Molspp  28        72.8  3.8461538
6  079-f2p1-Acetuna    Myokea   1        12.2  0.8196721
7  079-f2p1-Acetuna    Nocalb  10        24.3  4.1152263
8  079-f2p1-Acetuna    Phyllo   4        36.4  1.0989011
9  079-f2p1-Acetuna    Ptedav   3        36.4  0.8241758
10 079-f2p1-Acetuna    Ptegym   6        36.4  1.6483516
11 079-f2p1-Acetuna    Ptepar   9        36.4  2.4725275
12 079-f2p1-Acetuna    Pteper   4        24.3  1.6460905
13 079-f2p1-Acetuna    Rhotum  30        36.4  8.2417582
14 079-f2p1-Acetuna    Sacbil  11        36.4  3.0219780
15 079-f2p1-Acetuna    Saclep  32        36.4  8.7912088
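
Note that ra is a ratio of sums, not an average of the nightly Std AI values, and the two generally differ.  A small base-R sketch using the two Eumspp nights from the posted data (the names ratio.of.sums and mean.of.ratios are made up here for illustration):

```r
# Eumspp rows from the posted data: two nights at the same site
ai          <- c(1, 4)
survey.time <- c(12.1, 12.2)

# sum first, then divide: each night is weighted by its survey effort
ratio.of.sums  <- sum(ai) / sum(survey.time) * 10

# divide first, then average: every night counts equally regardless of effort
mean.of.ratios <- mean(ai / survey.time * 10)

print(c(ratio.of.sums = ratio.of.sums, mean.of.ratios = mean.of.ratios))
```

The first value should agree with the ra shown for Eumspp in the output above (2.0576132); the second is slightly different because the two nights had different survey times.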

For a simple aggregation like this, reshape2 is simple and fast.  I tend to use plyr when things get more complicated.
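
For comparison, a plyr version might look something like the sketch below.  It is only an illustration: the small data frame is built from three rows of the posted data, and the column names total.ai and total.time are invented here.

```r
library(plyr)

# Three rows copied from the posted data, with names already lower-cased
d <- data.frame(
  location.name = "079-f2p1-Acetuna",
  spec.code     = c("Eumspp", "Frag", "Eumspp"),
  survey.time   = c(12.1, 12.1, 12.2),
  ai            = c(1, 18, 4),
  stringsAsFactors = FALSE
)

# one output row per location/species: summed AI, summed effort, and their ratio
results <- ddply(d, .(location.name, spec.code), summarise,
                 total.ai   = sum(ai),
                 total.time = sum(survey.time),
                 ra         = sum(ai) / sum(survey.time) * 10)

print(results)
```

Here the ratio is computed inside the aggregation, which is convenient but does the division once per group rather than once over the whole result as in the reshape2 version.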

Jason Law
Statistician
City of Portland
Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203-5452
503-823-1038
jason.law at portlandoregon.gov

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Bruce Miller
Sent: Saturday, April 20, 2013 6:55 AM
To: r-help at r-project.org
Subject: [R] Reshape or Plyr?

Hi all,

I have relative abundance data from >100 sites.  This is from acoustic monitoring, and usually the data are for 2-3 nights, but in some cases may cover longer periods, such as months or years, for each location.
The output from my management database is provided by species by night for each location, so the data frame looks like the one below. What I need to do is sum Survey_Time by SPEC_CODE for each location name, divide the summed AI values for each SPEC_CODE by the summed Survey_Time to adjust for unit effort, and then multiply by 10 to express relative abundance per 10 survey hours. How best to do this?
Using Plyr or reshape?

Location name 	SPEC_CODE 	Start_Day 	Survey_Time 	AI 	Std AI
079-f2p1-Acetuna 	Buzz 	2/14/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Buzz 	2/14/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Eumspp 	2/14/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Frag 	2/14/2012 	12.1 	18 	14.87603
079-f2p1-Acetuna 	Molspp 	2/14/2012 	12.1 	5 	4.132231
079-f2p1-Acetuna 	Molspp 	2/14/2012 	12.1 	5 	4.132231
079-f2p1-Acetuna 	Phyllo 	2/14/2012 	12.1 	2 	1.652893
079-f2p1-Acetuna 	Ptedav 	2/14/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Ptegym 	2/14/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Ptepar 	2/14/2012 	12.1 	2 	1.652893
079-f2p1-Acetuna 	Rhotum 	2/14/2012 	12.1 	6 	4.958678
079-f2p1-Acetuna 	Sacbil 	2/14/2012 	12.1 	6 	4.958678
079-f2p1-Acetuna 	Saclep 	2/14/2012 	12.1 	11 	9.090909
079-f2p1-Acetuna 	Buzz 	2/15/2012 	12.1 	2 	1.652893
079-f2p1-Acetuna 	Buzz 	2/15/2012 	12.1 	2 	1.652893
079-f2p1-Acetuna 	Molmol 	2/15/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Molspp 	2/15/2012 	12.1 	7 	5.785124
079-f2p1-Acetuna 	Molspp 	2/15/2012 	12.1 	7 	5.785124
079-f2p1-Acetuna 	Nocalb 	2/15/2012 	12.1 	6 	4.958678
079-f2p1-Acetuna 	Phyllo 	2/15/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Ptedav 	2/15/2012 	12.1 	1 	0.8264463
079-f2p1-Acetuna 	Ptegym 	2/15/2012 	12.1 	4 	3.305785
079-f2p1-Acetuna 	Ptepar 	2/15/2012 	12.1 	4 	3.305785
079-f2p1-Acetuna 	Pteper 	2/15/2012 	12.1 	3 	2.479339
079-f2p1-Acetuna 	Rhotum 	2/15/2012 	12.1 	7 	5.785124
079-f2p1-Acetuna 	Sacbil 	2/15/2012 	12.1 	2 	1.652893
079-f2p1-Acetuna 	Saclep 	2/15/2012 	12.1 	6 	4.958678
079-f2p1-Acetuna 	Buzz 	2/16/2012 	12.2 	1 	0.8196721
079-f2p1-Acetuna 	Buzz 	2/16/2012 	12.2 	1 	0.8196721
079-f2p1-Acetuna 	Eumspp 	2/16/2012 	12.2 	4 	3.278688
079-f2p1-Acetuna 	Molspp 	2/16/2012 	12.2 	2 	1.639344
079-f2p1-Acetuna 	Molspp 	2/16/2012 	12.2 	2 	1.639344
079-f2p1-Acetuna 	Myokea 	2/16/2012 	12.2 	1 	0.8196721
079-f2p1-Acetuna 	Nocalb 	2/16/2012 	12.2 	4 	3.278688
079-f2p1-Acetuna 	Phyllo 	2/16/2012 	12.2 	1 	0.8196721
079-f2p1-Acetuna 	Ptedav 	2/16/2012 	12.2 	1 	0.8196721
079-f2p1-Acetuna 	Ptegym 	2/16/2012 	12.2 	1 	0.8196721
079-f2p1-Acetuna 	Ptepar 	2/16/2012 	12.2 	3 	2.459016
079-f2p1-Acetuna 	Pteper 	2/16/2012 	12.2 	1 	0.8196721
079-f2p1-Acetuna 	Rhotum 	2/16/2012 	12.2 	17 	13.93443
079-f2p1-Acetuna 	Sacbil 	2/16/2012 	12.2 	3 	2.459016
079-f2p1-Acetuna 	Saclep 	2/16/2012 	12.2 	15 	12.29508


Thanks for any suggestions.  Trying to do this in Excel would be a mess.

Bruce
