[R] Data Manipulations - Group By equivalent

ronggui ronggui.huang at gmail.com
Mon Jul 3 06:59:27 CEST 2006


Using the doBy package makes this easier.

# GENERATE A TREATMENT GROUP #
group<-as.factor(paste("treatment", rep(1:2, 4), sep = '_'));
# CREATE A SERIES OF RANDOM VALUES #
x<-rnorm(length(group));
# CREATE A DATA FRAME TO COMBINE THE ABOVE TWO #
data<-data.frame(group, x);
library(doBy)
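# NOTE: prefix= IS FROM THE 2006 doBy API; LATER RELEASES MAY NAME IT DIFFERENTLY, SEE ?summaryBy #
summ2<-summaryBy(x~group,data=data,FUN=c(mean,sum),na.rm=TRUE,prefix=c("mean","sum"))
# MERGE THE SUMMARIES (summ2) BACK ONTO THE ORIGINAL ROWS BY THE SHARED COLUMN group #
combine2<-merge(data,summ2)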
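
For the original question (one average occupancy per hotel for 2005), a base-R sketch without any add-on package might look like the one below. The data frame `occ` and its columns `hotel`, `year`, `month` and `occupancy` are assumed names with made-up values, since the original post does not show the data.

```r
# Hypothetical panel: 3 hotels x 12 months x 2 years (all names are assumptions)
set.seed(1)
occ <- expand.grid(month = 1:12, hotel = c("H1", "H2", "H3"), year = c(2004, 2005))
occ$occupancy <- runif(nrow(occ), min = 0.4, max = 0.95)

# Roughly: SELECT hotel, AVG(occupancy) FROM occ WHERE year = 2005 GROUP BY hotel
avg2005 <- aggregate(occupancy ~ hotel, data = subset(occ, year == 2005), FUN = mean)

# The same averages as a named vector, using tapply() on the 2005 rows only
occ05 <- occ[occ$year == 2005, ]
avg_vec <- tapply(occ05$occupancy, occ05$hotel, mean)
```

aggregate() returns a data frame with one row per hotel, which is usually the more convenient shape if the result is to be merged or exported; tapply() is handy when only the numbers are needed.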

Ronggui


2006/7/2, Wensui Liu <liuwensui at gmail.com>:
> Zubin,
>
> I bet you are working for InterContinental Hotels, and I think you are
> probably not the real Zubin there, right? ^_^ If you have a chance, could
> you please say hi to him for me?
>
> Here is a piece of R code copied from my blog, shown side by side with the
> equivalent SAS code. You might need to tweak it a little to get what you need.
>
>  CALCULATE GROUP SUMMARY IN R
> ##################################################
> # HOW TO CALCULATE GROUP SUMMARY IN R #
> # DATE : DEC-13, 2005 #
> ##################################################
> # EQUIVALENT SAS CODE: #
> # #
> # DATA DATA; #
> # DO I = 1 TO 2; #
> # DO J = 1 TO 4; #
> # GROUP = 'TREATMENT_'||PUT(I, 1.); #
> # X = RANNOR(1); #
> # OUTPUT; #
> # END; #
> # END; #
> # KEEP GROUP X; #
> # RUN; #
> # #
> # PROC SQL; #
> # CREATE TABLE COMBINE AS #
> # SELECT *, MEAN(X) AS MEAN_X, SUM(X) AS SUM_X #
> # FROM DATA #
> # GROUP BY GROUP; #
> # QUIT; #
> ##################################################
>
>
> # GENERATE A TREATMENT GROUP #
> group<-as.factor(paste("treatment", rep(1:2, 4), sep = '_'));
>
> # CREATE A SERIES OF RANDOM VALUES #
> x<-rnorm(length(group));
>
> # CREATE A DATA FRAME TO COMBINE THE ABOVE TWO #
> data<-data.frame(group, x);
>
> # CALCULATE SUMMARY FOR X #
> x.mean<-tapply(data$x, data$group, mean, na.rm = TRUE);
> x.sum<-tapply(data$x, data$group, sum, na.rm = TRUE);
>
> # CREATE A DATA FRAME TO COMBINE SUMMARIES #
> summ<-data.frame(x.mean, x.sum, group = names(x.mean));
>
> # COMBINE DATA AND SUMMARIES TOGETHER #
> combine<-merge(data, summ, by = "group");
>
>
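
The summarise-then-merge steps above can also be collapsed with base R's ave(), which returns the group statistic already aligned to each row, much like the remerge done by the SAS PROC SQL step. A minimal sketch on the same kind of simulated data (the names `combine_ave`, `mean_x` and `sum_x` are only illustrative):

```r
# Same simulated input as in the quoted code: two treatment groups, four rows each
group <- as.factor(paste("treatment", rep(1:2, 4), sep = "_"))
x <- rnorm(length(group))
data <- data.frame(group, x)

# ave() applies FUN within each group and repeats the result on every row of
# that group, so no separate summary table or merge() is needed
combine_ave <- data
combine_ave$mean_x <- ave(data$x, data$group, FUN = mean)
combine_ave$sum_x  <- ave(data$x, data$group, FUN = sum)
```

Unlike merge(), this keeps the rows in their original order. Note that FUN has to be passed by name, because the unnamed arguments of ave() are taken as grouping variables.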
> On 7/1/06, zubin <binabina at bellsouth.net> wrote:
> >
> > Hello, I am a beginner R user. Boy, I wish there was a book on just data
> > manipulations for SAS users learning R (equivalent to the SAS DATA
> > STEP). Okay, my question:
> >
> > I have a panel data set: hotel occupancy data by month for 12 months,
> > 1000 hotels.  I have a field labeled 'year' and want to consolidate the
> > monthly records using an average into 1000 occupancy numbers, just a
> > simple average of the 12 months by hotel.  In SQL this operation is
> > pretty easy, a group by query (group by hotel where year = 2005, avg
> > occupancy). How is this done in R (in the R language, not SQL)?  Thx!
> >
> > -zubin
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
>
>
>
> --
> WenSui Liu
> (http://spaces.msn.com/statcompute/blog)
> Senior Decision Support Analyst
> Health Policy and Clinical Effectiveness
> Cincinnati Children's Hospital Medical Center
>


-- 
黄荣贵 (Ronggui Huang)
Department of Sociology
Fudan University


