[R] Need to calculate standard deviation by groups
Gerrit.Eichner at math.uni-giessen.de
Fri Dec 9 10:17:51 CET 2011
Does ave(), with sd supplied as its FUN argument, not do what you want?
with( Dataset, ave( x = B, C, D, FUN = sd))
should do it.
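
To make that concrete, here is a small sketch on made-up data. Only the names Dataset, A, B, C, D and E come from your description; the toy values and the small number of factor levels are my assumptions for illustration.

```r
# Toy stand-in for the real data (values are invented for illustration).
set.seed(1)
Dataset <- data.frame(
  A = 1:12,                                        # ID
  B = rnorm(12),                                   # continuous variable
  C = factor(rep(c("c1", "c2"), each = 6)),        # first grouping factor
  D = factor(rep(c("d1", "d2", "d3"), times = 4))  # second grouping factor
)

# ave() returns a vector as long as B, holding for each case the SD of B
# within that case's C-by-D group, so it can be assigned straight to a column.
Dataset$E <- with(Dataset, ave(x = B, C, D, FUN = sd))

head(Dataset)
```

Note that a group containing only a single case gets NA, because sd() of one value is NA.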
Hth -- Gerrit
On Fri, 9 Dec 2011, Zsuzsanna Papp wrote:
> Please help me with this basic question, I already spent two days on the
> internet and textbooks trying to come up with an answer...
> I will simplify my question to an example, rather than base it on the
> original variable names.
> I have a Dataset with 4 variables, 20000 cases. Variable A is an ID.
> Variable B is a continuous numerical variable, unique to each A.
> Variable C is a categorical factor with 6 possible levels. Variable D is
> also a categorical factor, with 300 different levels.
> I would like to create a new variable=E, which is the standard deviation
> of B around the group means of B, groups defined by C and D.
> I had no problem creating such a column to get group means (with the ave()
> function), but cannot find a solution for another function like sd that
> would assign the proper group value to each case.
> I tried
> Dataset$E <- with(Dataset, tapply(B, list(C,D),FUN=sd))
> but it is wrong, as it takes the 1800 different SD values, puts them in
> column E, then puts the same array of numbers there below it, repeats as
> many times as possible until the column is filled. The SD values do not
> correspond to the proper groups.
> How can I match these data (1800 different SD values) to their
> corresponding cases in my original data?
> Is there a shortcut to do this all in one line, as for the means with
> the ave() function?
> I also tried ddply but I am doing something wrong (my R is on Linux and
> I do not yet know how to get error messages, so I do not know what is
> wrong with my lines).
> Thank you for any help! Please give me as detailed a script as possible.
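
On matching the tapply() result back to the cases: tapply() returns one value per C-by-D cell (a matrix with the levels of C as row names and the levels of D as column names), not one value per case, so it cannot be assigned to a column directly. One possible way to do the lookup, sketched here on made-up data (the toy values and the column name E2 are my assumptions), is to index that matrix with a two-column character matrix of each case's levels:

```r
# Toy stand-in for the real data (values are invented for illustration).
set.seed(1)
Dataset <- data.frame(
  A = 1:12,
  B = rnorm(12),
  C = factor(rep(c("c1", "c2"), each = 6)),
  D = factor(rep(c("d1", "d2", "d3"), times = 4))
)

# One SD per cell: rows are levels of C, columns are levels of D.
sds <- with(Dataset, tapply(B, list(C, D), FUN = sd))

# One row per case: the row name and column name of that case's cell.
idx <- cbind(as.character(Dataset$C), as.character(Dataset$D))

# Indexing a matrix by a two-column character matrix picks one cell per row.
Dataset$E2 <- sds[idx]

# Same values as the one-line ave() call.
all.equal(Dataset$E2, with(Dataset, ave(B, C, D, FUN = sd)))
```

The ave() call gives the same column in one step, so this is mainly useful if you want the table of group SDs as well.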
Dr. Gerrit Eichner
Mathematical Institute, Room 212
Justus-Liebig-University Giessen
Arndtstr. 2, 35392 Giessen, Germany
gerrit.eichner at math.uni-giessen.de
Tel: +49-(0)641-99-32104
Fax: +49-(0)641-99-32109
http://www.uni-giessen.de/cms/eichner