[R] User-defined functions in dplyr

Axel Urbiz axel.urbiz at gmail.com
Fri Oct 30 00:55:19 CET 2015


Hello,

Sorry, I am resending this question because the previous message was not
sent properly.

I’m using the plyr package below to add a variable named "bin" to my
original data frame "df" with the user-defined function "create_bins". I'd
like to get similar results using dplyr instead, but I have not been able
to do so.

set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))


### Using plyr (works fine)
create_bins <- function(x, nBins)
{
  # Quantile-based break points; unique() drops duplicated breaks
  Breaks <- unique(quantile(x$pred, probs = seq(0, 1, 1/nBins)))
  dfB <- data.frame(pred = x$pred,
                    bin = cut(x$pred, breaks = Breaks,
                              include.lowest = TRUE))
  dfB
}

nBins <- 10
res_plyr <- plyr::ddply(df, plyr::.(models), create_bins, nBins)
head(res_plyr)

### Using dplyr (fails)

by_group <- dplyr::group_by(df, models)
res_dplyr <- dplyr::summarize(by_group, create_bins, nBins)
Error: not a vector
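
For context on the error: summarize() evaluates each argument as an expression that should return a vector per group, and here "create_bins" is passed as a bare function object rather than called. A possible dplyr version is sketched below, assuming a reasonably recent dplyr and assuming it is acceptable to keep one row per observation via mutate(). The helper "bin_vector" is a hypothetical name introduced only for this sketch, and it returns the bin labels as character rather than factor so that groups with different break points combine without level conflicts.

```r
library(dplyr)

set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))
nBins <- 10

# Hypothetical helper (not in the original post): takes a numeric vector
# and returns the quantile-based bin label of each element as character.
bin_vector <- function(v, nBins) {
  breaks <- unique(quantile(v, probs = seq(0, 1, 1 / nBins)))
  as.character(cut(v, breaks = breaks, include.lowest = TRUE))
}

# Within each group, "pred" refers to that group's values only,
# so the breaks are computed separately per model.
res_dplyr <- df %>%
  group_by(models) %>%
  mutate(bin = bin_vector(pred, nBins)) %>%
  ungroup()

head(res_dplyr)
```

Unlike the plyr result, this keeps the original column order (pred, models, bin) and returns a tibble; wrap the result in as.data.frame() if a plain data frame is needed.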


Any help would be much appreciated.

Best,
Axel.
