[R] Competing with SPSS and SAS: improving code that loops through rows (data manipulation)

hadley wickham h.wickham at gmail.com
Sat Mar 27 18:39:21 CET 2010


> # Set up the ratio variables
> system.time({
> temp <- cbind(data, do.call(cbind, lapply(names(data)[3:4], function(.x)
>        {
>                unlist(by(data, data$group, function(.y) .y[,.x] / max(.y[,.x])))
>        })))
> colnames(temp)[5:6] <- paste(colnames(data)[3:4], 'ind.to.max', sep = '.')
> })
>

This part can be done quite straightforwardly with plyr:

library(plyr)
temp <- ddply(data, c("group"), transform,
  a.ind.to.max = a / max(a),
  b.ind.to.max = b / max(b))
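
For comparison, here is a minimal base R sketch of the same per-group ratio using ave(), in case loading plyr is not an option. The sample data frame is hypothetical: the column names group, a and b are taken from the ddply call above, and the id column is only an assumed placeholder so that a and b sit in columns 3 and 4 as in the quoted code.

```r
# Hypothetical sample data; only the names group, a and b come from the thread
data <- data.frame(
  id    = 1:6,
  group = c("x", "x", "x", "y", "y", "y"),
  a     = c(1, 2, 4, 10, 5, 20),
  b     = c(3, 6, 9, 2, 8, 4)
)

# ave() applies FUN within each level of the grouping variable and
# returns a vector aligned with the original rows
temp <- transform(data,
  a.ind.to.max = ave(a, group, FUN = function(v) v / max(v)),
  b.ind.to.max = ave(b, group, FUN = function(v) v / max(v))
)
```

One difference to keep in mind: ave() leaves the rows in their original order, whereas ddply returns them sorted by the grouping variable, so the two results match row for row only when data is already ordered by group.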

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


