[R]: z-scores for different factor levels

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Fri Jan 18 12:11:28 CET 2002


"Stuart Leask" <stuart.leask at nottingham.ac.uk> writes:

> Hi there.
> I am trying to generate z-scores for a variable according to its factor
> level, leaving this result in the original dataframe.
> 
> i.e. standardised birth weight for gestational age in weeks.
> BWT is birthweight
> GEST is gestational age in weeks (equivalent to the factor level)
> 
> I can generate the factor level mean & SD using tapply
> tapply(BWT,GEST,mean) etc.
> but this creates a new array of means & SDs.
> 
> - Can anyone suggest how I can slot these means & SDs by factor level
> straight back into the original dataframe, so I can then subtract the mean &
> divide by the SD to get a Z-score for each case?
> - Is there a function already available that can generate z-scores by factor
> levels?
> 
> Stuart

I'd try something along the lines of

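n <- length(BWT)
BWTz <- numeric(n)
# split(1:n,GEST) lists the positions belonging to each level, in the same
# order as split(BWT,GEST), so the scaled values land back in their own slots
BWTz[unlist(split(1:n,GEST))] <- unlist(lapply(split(BWT,GEST),scale))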

or, probably better

n <- length(BWT)
BWTz <- numeric(n)
for (i in split(1:n,GEST))
       BWTz[i] <- scale(BWT[i])
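
To tie the loop back to the original question (keeping the result in the data frame), here is a minimal sketch on made-up data. The data frame name `births` and all the numbers in it are assumptions for illustration only; the column names BWT and GEST follow the question.

```r
# Hypothetical toy data; the values are invented purely for illustration.
births <- data.frame(
  GEST = c(38, 40, 38, 39, 40, 39, 38, 40, 39),
  BWT  = c(3100, 3600, 2900, 3300, 3400, 3250, 3000, 3800, 3350)
)

n <- nrow(births)
BWTz <- numeric(n)
# Each element of the split is the set of row positions for one gestational age.
for (i in split(seq_len(n), births$GEST))
  BWTz[i] <- scale(births$BWT[i])   # (x - mean(x)) / sd(x) within that group

# Store the z-scores next to the original columns.
births$BWTz <- BWTz

# Sanity check: within each level the z-scores should have mean 0 and SD 1.
grp_mean <- tapply(births$BWTz, births$GEST, mean)
grp_sd   <- tapply(births$BWTz, births$GEST, sd)
```

Note that a level with only one observation has no defined SD, so scale() would return NaN for that row.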


[The cute way would be 

BWTz <- unsplit(lapply(split(BWT,GEST),scale),GEST)

but someone would have to write unsplit() first...]
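
For readers on a current version of R: unsplit() is now part of base R, and so is ave(), which applies a function within the levels of a grouping variable and returns the result in the original order. A minimal sketch comparing the two, again on invented data; the helper name `zscore` is hypothetical and is used here instead of scale() only so that each group comes back as a plain vector.

```r
# Hypothetical toy data; the values are invented purely for illustration.
BWT  <- c(3100, 3600, 2900, 3300, 3400, 3250, 3000, 3800, 3350)
GEST <- factor(c(38, 40, 38, 39, 40, 39, 38, 40, 39))

# Hypothetical helper: standardise a vector to mean 0 and SD 1.
zscore <- function(x) (x - mean(x)) / sd(x)

# The "cute way": split by level, standardise each piece, put the pieces back.
z_unsplit <- unsplit(lapply(split(BWT, GEST), zscore), GEST)

# The same computation in one call with ave().
z_ave <- ave(BWT, GEST, FUN = zscore)
```

Both results line up element by element with BWT, so either can be assigned straight into a column of the original data frame.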
-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907