[R] Sapply

Duncan Murdoch murdoch at stats.uwo.ca
Mon Aug 31 00:47:34 CEST 2009


On 30/08/2009 6:08 PM, Noah Silverman wrote:
> Hi,
> 
> I need a bit of guidance with the sapply function.  I've read the help 
> page, but am still a bit unsure how to use it.
> 
> I have a large data frame with about 100 columns and 30,000 rows.  One 
> of the columns is "group" of which there are about 2,000 distinct "groups".
> 
> I want to normalize (sum to 1) one of my variables per-group.
> 
> Normally, I would just write a huge "for each" loop, but I have read that 
> this is hugely inefficient in R.

Don't believe what you read; try it.  If the for loop takes 100 times 
longer than the fastest method, but it still only takes 10 seconds, is 
it worth optimizing?
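
One way to "try it" is to time the loop on simulated data of roughly the size described. This is only a sketch under assumptions: the data frame `data`, its columns `group` and `score`, and the random values are made up to match the description (30,000 rows, about 2,000 groups), and the timing will differ from machine to machine.

```r
# Simulated stand-in for the real data (hypothetical values, not the poster's).
set.seed(1)
n <- 30000
data <- data.frame(group = sample(2000, n, replace = TRUE),
                   score = runif(n))

# Time a plain for loop that normalizes score within each group.
loop_time <- system.time({
  data$new_score <- NA_real_
  for (g in unique(data$group)) {
    rows <- data$group == g                      # rows in the current group
    data$new_score[rows] <- data$score[rows] / sum(data$score[rows])
  }
})
print(loop_time)

# Sanity check: the normalized scores in every group should sum to 1.
group_sums <- as.vector(tapply(data$new_score, data$group, sum))
print(isTRUE(all.equal(group_sums, rep(1, length(group_sums)))))
```

If the elapsed time printed by `system.time()` is acceptable for the job at hand, the loop may be good enough as it stands.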

Duncan Murdoch

> 
> The old way would be (just an example, syntax might not be perfect):
> 
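> data$new_score <- NA_real_
> for (group in unique(data$group)) {
>      # rows belonging to the current group
>      rows <- data$group == group
>      # divide each score by its group total so the group sums to 1
>      data$new_score[rows] <- data$score[rows] / sum(data$score[rows])
> }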
> 
> How would I simplify this with sapply?
> 
> Thanks!
> 
> --
> Noah
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
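
As for the sapply question itself, here is a minimal sketch of two loop-free ways to do the per-group normalization. It assumes the column names `group` and `score` from the question; the five-row data frame is made up for illustration. Because the result has one value per row rather than one per group, `ave()` is arguably a more natural fit than `sapply()`, but an `sapply()` variant is shown as well.

```r
# Small made-up example; column names mirror those in the question.
data <- data.frame(group = c("a", "a", "b", "b", "b"),
                   score = c(1, 3, 2, 2, 4))

# ave() applies FUN within each group and returns the results
# in the original row order.
data$new_score <- ave(data$score, data$group, FUN = function(x) x / sum(x))

# Equivalent using sapply(): compute one total per group, then look up
# each row's group total by name and divide.
totals <- sapply(split(data$score, data$group), sum)
data$new_score2 <- data$score / unname(totals[as.character(data$group)])

print(data)
# new_score and new_score2 are both 0.25, 0.75, 0.25, 0.25, 0.50
```

Both versions avoid the explicit loop; whether they are worth using over the loop depends on the timing considerations raised in the reply.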



