[R] Difficult subset challenge

Noah Silverman noahsilverman at ucla.edu
Sat Dec 10 22:44:49 CET 2011


I'm having difficulty coming up with a good way to subset some data to generate statistics.

My data frame has multiple observations by group.

Here is an overly-simplified toy example of the data:
code    v1     v2
G1      1.2    2.3
G1      0      2.4
G1      1.4    3.4
G2      2.9    2.3
G2      4.3    4.4

I want to normalize the data *by group* for certain variables. But I want to ignore 0 values when calculating the mean and standard deviation.

What I *want* to do is something like this:
	for (code in unique(d$code)) {
		mu <- mean( d[which(d[d$code==code,v1] !=0 ), v1] )
		sig <- sd( d[which(d[d$code==code,v1] !=0 ), v1] )
		d[which(d[d$code==code,v1] !=0 ), cname] <- (d[which(d[d$code==code,v1] !=0 ), v1] - mu) / sig
	}

My goal, if it isn't apparent, is to replace each value with its normalized value. (But the statistics used for normalization are calculated skipping zero values.)

This doesn't work because the indices returned by which() are relative to the group subset (1, 2, 3, etc.), not row positions in the full data frame.
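
For illustration, here is a minimal sketch of one way the by-group normalization described above might be written so that no relative indices are needed. It is not a tested solution for the real data. It assumes the toy data frame shown earlier, no NA values in the column, and at least two non-zero values per group (otherwise sd() returns NA). The helper name normalize_nonzero is hypothetical, and the sketch overwrites v1 in place rather than writing to a separate column.

```r
# Toy data frame, copied from the example in this post
d <- data.frame(
  code = c("G1", "G1", "G1", "G2", "G2"),
  v1   = c(1.2, 0, 1.4, 2.9, 4.3),
  v2   = c(2.3, 2.4, 3.4, 2.3, 4.4)
)

# Hypothetical helper: z-score the non-zero entries of x and leave zeros as 0.
# Assumes x has no NA values and at least two non-zero entries.
normalize_nonzero <- function(x) {
  nz <- x != 0                                 # logical mask, same length as x
  x[nz] <- (x[nz] - mean(x[nz])) / sd(x[nz])   # statistics use non-zero values only
  x
}

# ave() applies the function within each level of d$code and returns the
# results in the original row order, so no which() bookkeeping is required.
d$v1 <- ave(d$v1, d$code, FUN = normalize_nonzero)
```

Because the logical mask is built inside each group's own vector, the positions always line up with the values being replaced. The same call could be repeated for other columns, for example in a loop over a vector of column names.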


Noah Silverman
UCLA Department of Statistics
8208 Math Sciences Building
Los Angeles, CA 90095
