[R] Getting group-wise standard scores of a vector

jim holtman jholtman at gmail.com
Thu Sep 27 19:57:30 CEST 2007


Is this what you want? ave() applies a function within each group and
returns the results in the original row order:

> test.data
          x group
1  32.66782     A
2  50.02132     A
3  43.69700     A
4  46.59031     A
5  38.43428     A
6  68.03142     A
7  46.68868     A
8  33.94487     A
9  51.97193     A
10 52.63176     A
11 40.14173     B
12 21.11079     B
13 43.59518     B
14 55.70508     B
15 49.40277     B
16 49.01821     B
17 55.60821     B
18 38.13541     B
19 60.96777     B
20 49.94656     B
> test.data$z <- ave(test.data$x, test.data$group, FUN = function(z) (z - mean(z)) / sd(z))
> test.data
          x group           z
1  32.66782     A -1.33241490
2  50.02132     A  0.34308231
3  43.69700     A -0.26753700
4  46.59031     A  0.01181557
5  38.43428     A -0.77565765
6  68.03142     A  2.08197466
7  46.68868     A  0.02131284
8  33.94487     A -1.20911451
9  51.97193     A  0.53141612
10 52.63176     A  0.59512257
11 40.14173     B -0.54637780
12 21.11079     B -2.21770874
13 43.59518     B -0.24308968
14 55.70508     B  0.82042267
15 49.40277     B  0.26694269
16 49.01821     B  0.23317041
17 55.60821     B  0.81191546
18 38.13541     B -0.72257633
19 60.96777     B  1.28260182
20 49.94656     B  0.31469951
>
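
For anyone who wants to rerun this, here is a minimal self-contained sketch
of the same ave() call on simulated data. The seed and the names dat and
zscore are assumptions made for illustration, not part of the original data,
so the numbers will differ from the transcript above. The last lines check
that each group ends up with mean 0 and sd 1.

```r
# Simulated data; the seed value is arbitrary (an assumption, for reproducibility)
set.seed(1)
dat <- data.frame(x = rnorm(20, 50, 10),
                  group = rep(c("A", "B"), each = 10))

# Hypothetical helper: standard score of a numeric vector
zscore <- function(v) (v - mean(v)) / sd(v)

# ave() applies FUN within each group and returns a vector in the original row order
dat$z <- ave(dat$x, dat$group, FUN = zscore)

# Sanity check: within each group, z should have mean ~0 and sd 1
group_means <- tapply(dat$z, dat$group, mean)
group_sds   <- tapply(dat$z, dat$group, sd)
print(round(group_means, 10))
print(round(group_sds, 10))
```

Because ave() keeps the row order, the result can be assigned straight to a
new column without sorting or stacking anything.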


On 9/27/07, Wayne.W.Jones at shell.com <Wayne.W.Jones at shell.com> wrote:
>
> tapply is also very useful:
>
>
> my.df <- data.frame(x = rnorm(20, 50, 10), group = factor(sort(rep(c("A", "B"), 10))))
> tapply(my.df$x, my.df$group, function(x) (x - mean(x)) / sd(x))
>
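
One caveat with tapply here: because the function returns a whole vector for
each group, the result is a list with one element per group, not a single
vector lined up with the rows of the data frame. A small sketch of how the
pieces could be put back in row order with split() and unsplit() follows. The
names df2, zfun, z_by_group and pieces, and the seed, are assumptions made
for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
df2 <- data.frame(x = rnorm(20, 50, 10),
                  group = factor(rep(c("B", "A"), times = 10)))  # deliberately interleaved

# Hypothetical helper: standard score of a numeric vector
zfun <- function(v) (v - mean(v)) / sd(v)

# tapply() returns a list-like array: one vector of z-scores per group
z_by_group <- tapply(df2$x, df2$group, zfun)
str(z_by_group)

# split() / unsplit() do the same per-group work and restore the original row order
pieces <- lapply(split(df2$x, df2$group), zfun)
df2$z <- unsplit(pieces, df2$group)

# Compare with ave(), which does the split/apply/reassemble in one call
print(all.equal(df2$z, ave(df2$x, df2$group, FUN = zfun)))
```

When the groups are already sorted, unlist() on the tapply result happens to
line up with the rows, but that does not hold for interleaved groups like the
ones above.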
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Matthew Dubins
> Sent: 26 September 2007 21:57
> To: r-help at r-project.org
> Subject: [R] Getting group-wise standard scores of a vector
>
>
> Hi,
>
> I want to be able to create a vector of z-scores from a vector of
> continuous data, conditional on a group membership vector.
>
> Say you have 20 numbers distributed normally with a mean of 50 and an sd
> of 10:
>
> x <- rnorm(20, 50, 10)
>
>
> Then you have a vector that delineates 2 groups within x:
>
> group <- sort(rep(c("A", "B"), 10))
>
> test.data <- data.frame(x, group)
>
> I know that if you break up the x vector into 2 different vectors then
> it becomes easy to calculate the z scores for each vector, then you
> stack them and append them to the original
> data frame.  Is there any way to apply this sort of calculation without
> splitting the original vector up?  I tried a really complex ifelse
> statement but it didn't seem to work.
>
> Thanks in advance,
> Matthew Dubins
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


