[R] column statistics

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Mon Dec 7 11:35:15 CET 2009


You have several options in R.

1) cast from the reshape package

cast(your.data.frame, Factor1 + Factor2 ~ ., value = "Value",
     fun.aggregate = mean)

2) ddply from the plyr package

ddply(your.data.frame, c("Factor1", "Factor2"),
      function(x) mean(x$Value))
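
3) aggregate from base R (an illustrative sketch, not one of the two
package-based options above)

If installing a package is not wanted, base R's aggregate can probably do
the same job. The sketch below rebuilds the example data from the question
and assumes that every non-factor column is numeric. The object names dat,
means and all_means and the extra column Value2 are made up for
illustration.

```r
# Example data, rebuilt from the table in the original question.
dat <- data.frame(
  Factor1 = rep(c("A", "B"), each = 4),
  Factor2 = rep(rep(c("X", "Y"), each = 2), times = 2),
  Value   = 1:8
)

# Mean of Value for every combination of Factor1 and Factor2.
means <- aggregate(Value ~ Factor1 + Factor2, data = dat, FUN = mean)

# aggregate() lets the first grouping variable vary fastest, so reorder
# the rows to match the layout asked for in the question.
means <- means[order(means$Factor1, means$Factor2), ]
print(means)

# More than one response variable: Value2 is a hypothetical second column.
# The "." on the left-hand side stands for all columns that are not used
# as grouping factors, so they must all be numeric.
dat$Value2 <- dat$Value * 10
all_means <- aggregate(. ~ Factor1 + Factor2, data = dat, FUN = mean)
print(all_means)
```

With 4 factors and 15 variables the same call should carry over by listing
all four factors on the right-hand side of the formula. Note that the
formula interface drops rows with missing values by default.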

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On behalf of Ivan Calandra
Sent: Monday 7 December 2009 11:25
To: r-help at r-project.org
Subject: [R] column statistics

Hi everybody,

I would like to compute the mean of one variable over the rows that share
the same factor levels.

For example, with the dataset below:
Factor1      Factor2      Value
A               X               1
A               X               2
A               Y               3
A               Y               4
B               X               5
B               X               6
B               Y               7
B               Y               8

I would like to get:
Factor1      Factor2      Value
A               X               1.5
A               Y               3.5
B               X               5.5
B               Y               7.5

Up to now, I worked in Statistica and Systat, and it was called "column
statistics" in Statistica (and I had a script with the "BY" function in
Systat).

Of course this is a simplified case. My real dataset has 4 factors and
15 variables, so a general method would be nice. However, my skills are
not that great, so I would appreciate some explanation (beyond what is
in the ?function help pages, of course).

Thanks a lot in advance
Ivan
