[R] ddply question

Brian Diggs brian.s.diggs at gmail.com
Tue Sep 2 20:06:08 CEST 2014


On 8/30/2014 2:11 PM, Felipe Carrillo wrote:
> library(plyr)
> b <- structure(list(SampleDate = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L), .Label = "5/8/1996", class = "factor"), TotalCount = c(1L,
> 2L, 1L, 1L, 4L, 3L, 1L, 10L, 3L), ForkLength = c(61L, 22L, NA,
> NA, 72L, 34L, 100L, 23L, 25L), TotalSalvage = c(12L, 24L, 12L,
> 12L, 17L, 23L, 31L, 12L, 15L), Age = c(1L, 0L, NA, NA, 1L, 0L,
> 1L, 0L, 0L)), .Names = c("SampleDate", "TotalCount", "ForkLength",
> "TotalSalvage", "Age"), class = "data.frame", row.names = c(NA,
> -9L))
> b
> ddply(b,.(SampleDate,Age),summarise,salvage=sum(TotalSalvage),pct=TotalCount/sum(TotalCount))
> Error: expecting result of length one, got : 4

I get a slightly different error:

Error: length(rows) == 1 is not TRUE

but the problem is the same. sum() returns a single value, while the
computation for pct returns a vector the same length as TotalCount (one
element per row in the specific piece of b). summarise is designed to
take a data frame and reduce the number of rows in it by
aggregating/summarizing (some of) the columns, so every expression you
give it has to return the same number of values. Here salvage has
length one and pct has length four (for the Age == 0 piece), so it
errors out. It seems you don't want to reduce the number of rows, so
replace summarise with mutate. That function keeps every row and
recycles the length-one result of sum() across the rows of each piece.
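
As a minimal sketch of that change (assuming a reasonably current plyr;
b is rebuilt here with data.frame() instead of structure(), and the
name res is just my placeholder), something like this should give one
row per original row, with pct computed within each SampleDate/Age
piece:

```r
library(plyr)

# Same data as in the original post, written out with data.frame()
b <- data.frame(
  SampleDate   = factor(rep("5/8/1996", 9)),
  TotalCount   = c(1L, 2L, 1L, 1L, 4L, 3L, 1L, 10L, 3L),
  ForkLength   = c(61L, 22L, NA, NA, 72L, 34L, 100L, 23L, 25L),
  TotalSalvage = c(12L, 24L, 12L, 12L, 17L, 23L, 31L, 12L, 15L),
  Age          = c(1L, 0L, NA, NA, 1L, 0L, 1L, 0L, 0L)
)

# mutate keeps all rows of each piece; the single value from
# sum(TotalSalvage) is recycled, while pct stays one value per row
res <- ddply(b, .(SampleDate, Age), mutate,
             salvage = sum(TotalSalvage),
             pct     = TotalCount / sum(TotalCount))

nrow(res)  # 9, the same number of rows as b
res
```

Note that the rows with Age of NA end up as their own piece, so their
pct values are relative to each other, not to the Age 0 or Age 1 rows.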

(The other difference between summarise and mutate is that mutate keeps
the original columns while summarise drops all original columns and
returns only the computed ones, to which ddply adds back the grouping
columns; this makes sense given that summarise expects to return fewer
rows than in the original data.)
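
To see that column difference side by side, here is a small sketch
(the names by_group and by_row are mine, and ForkLength is left out of
b only to keep the example short):

```r
library(plyr)

# Subset of the columns from the original post
b <- data.frame(
  SampleDate   = factor(rep("5/8/1996", 9)),
  TotalCount   = c(1L, 2L, 1L, 1L, 4L, 3L, 1L, 10L, 3L),
  TotalSalvage = c(12L, 24L, 12L, 12L, 17L, 23L, 31L, 12L, 15L),
  Age          = c(1L, 0L, NA, NA, 1L, 0L, 1L, 0L, 0L)
)

# summarise: every expression has length one, so one row per piece
by_group <- ddply(b, .(SampleDate, Age), summarise,
                  salvage = sum(TotalSalvage),
                  Count   = sum(TotalCount))

# mutate: all rows and all original columns are kept
by_row <- ddply(b, .(SampleDate, Age), mutate,
                salvage = sum(TotalSalvage))

names(by_group)  # grouping columns plus salvage and Count
names(by_row)    # original columns plus salvage
```

With this data by_group should have three rows (Age 0, Age 1, and Age
NA) and by_row should have nine.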

> #Computing TotalCount inside ddply works but the pct seems wrong...
> ddply(b,.(SampleDate,Age),summarise,salvage=sum(TotalSalvage),Count=sum(TotalCount),pct=Count/sum(Count))


-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University


