[R] summary per group

Petr PIKAL petr.pikal at precheza.cz
Tue Jan 3 08:45:57 CET 2012


Hi

> 
> > Hi
> > 
> > > 
> > > Hello,
> > > 
> > > I know that it'll be quite easy to do what I want, but somehow I am
> > > lost as I am new to R. I want to get summary results arranged by
> > > groups. In detail, I'd like to get the number (levels) of Species per
> > > Family, like for this dataset:
> > > 
> > > SPEC <- factor(c("a","a","b","b","c","c","c","d","e","e","e","e"))
> > > FAM <- factor(c("A","A","A","A","B","B","B","C","C","C","C","C"))
> > > df <- data.frame(SPEC,FAM)
> > > 
> > > I tried tapply(SPEC, FAM, nlevels).. but it is not the result I am
> > > looking for...
> > > 
> > > What is the easiest way to do that? Do I have to rearrange the
> > > dataset?
> > 
> > To do what? Do you want the number of unique entries within each level
> > of FAM? If yes,
> > 
> > sapply(tapply(SPEC, FAM, unique), length)
> > 
> > can do this.
> > 
> > Regards
> > Petr
> 
> Thank you Petr,
> 
> that is exactly what I was looking for... Now I played around a little
> bit with that because I want to create a summary with FAM as a grouping
> variable. Besides the number of unique SPEC per FAM, I also want to get
> their levels as text. So far I have the following:
> 
> paste(unique(SPEC), collapse = ', ')
> 
> But how can I use that in combination with tapply and furthermore with
> cbind, like:
> 
> SPEC <- factor(c("a","a","b","b","c","c","c","d","e","e","e","e"))
> FAM <- factor(c("A","A","A","A","B","B","B","C","C","C","C","C"))
> df <- data.frame(SPEC,FAM)
> 
> with(df, cbind("Number of SPEC"=sapply(tapply(SPEC,FAM,unique),length), 
> "SPECs"=tapply(SPEC,FAM,unique)))

Quite close

with(df, cbind("Number of SPEC"=sapply(tapply(SPEC,FAM,unique),length),
"SPECs"=sapply(tapply(SPEC,FAM,unique), paste, collapse=",")))

You can easily look at the individual pieces of this code:

with(df, sapply(tapply(SPEC,FAM,unique),length))
with(df, sapply(tapply(SPEC,FAM,unique), paste, collapse=","))
with(df, tapply(SPEC,FAM,unique))

Actually, you could do better by storing the intermediate result first

res <- with(df, tapply(SPEC,FAM,unique))

and then reusing it:

cbind("Number of SPEC"=sapply(res,length),"SPECs"=sapply(res, paste, 
collapse=","))
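
As a quick check, the two-step version should give the same matrix as the
one-liner further up. This is only a minimal sketch; the names one_step and
two_step are made up for illustration and are not part of the original
question.

```r
# Data from the original question
SPEC <- factor(c("a","a","b","b","c","c","c","d","e","e","e","e"))
FAM <- factor(c("A","A","A","A","B","B","B","C","C","C","C","C"))
df <- data.frame(SPEC, FAM)

# One-liner: tapply() is evaluated twice
one_step <- with(df, cbind(
  "Number of SPEC" = sapply(tapply(SPEC, FAM, unique), length),
  "SPECs" = sapply(tapply(SPEC, FAM, unique), paste, collapse = ",")))

# Two-step: the unique SPEC values per FAM are computed once and reused
res <- with(df, tapply(SPEC, FAM, unique))
two_step <- cbind("Number of SPEC" = sapply(res, length),
                  "SPECs" = sapply(res, paste, collapse = ","))

# Compare both results
identical(one_step, two_step)
```

Storing res also makes it easier to inspect the intermediate list before it
is summarised.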

One comment (quite important): with cbind the result is a matrix. A matrix
is a vector with dimensions, and as such it can hold values of only one
type. In this case you end up with character values, so the numbers are
converted to character. If you want them to stay numbers, you need to use
data.frame.

see
?matrix
?data.frame

and R-intro, especially the part about objects and their properties.
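
A minimal sketch of that difference, assuming the same data as in the
question. The object names m and out and the column names n_spec and specs
are made up for illustration; the separator ", " is chosen only to match
the layout asked for in the question.

```r
# Data from the original question
SPEC <- factor(c("a","a","b","b","c","c","c","d","e","e","e","e"))
FAM <- factor(c("A","A","A","A","B","B","B","C","C","C","C","C"))
df <- data.frame(SPEC, FAM)

# Unique SPEC values within each level of FAM
res <- with(df, tapply(SPEC, FAM, unique))

# cbind() builds a matrix, so the counts are coerced to character
m <- cbind("Number of SPEC" = sapply(res, length),
           "SPECs" = sapply(res, paste, collapse = ", "))
is.character(m)

# data.frame() keeps one type per column, so the counts stay numeric
out <- data.frame(FAM = names(res),
                  n_spec = sapply(res, length),
                  specs = sapply(res, paste, collapse = ", "),
                  stringsAsFactors = FALSE)
str(out)
is.numeric(out$n_spec)
```

With the data frame you can do arithmetic on the count column directly,
which is not possible with the character matrix without converting back.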

Regards
Petr






> 
> The result should look like:
>     Number of SPEC SPECs
> A   2              "a, b"
> B   1              "c"
> C   2              "d, e"
> 
> Thank you,
> 
> /johannes
> 
> 


