[R] looping by grouping variable

jour4life jour4life at gmail.com
Wed Aug 31 18:50:34 CEST 2011


Hello all,

I hope this exact problem has not already been posted. I've read through the
forums and previous postings and am still confused about how to approach it.
Basically, I am trying to construct variables that use the average of a
variable within a grouping, or higher-order, variable. In my dataset, each
observation is a county. Each county has an ID variable, and I have extracted
other variables from substrings of that ID. In particular, I extracted a
state variable, and I want to calculate averages at the state level and use
those averages to compute another variable. I know this may be confusing, so
I'm posting an example dataset here:

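# County IDs; the leading zero is dropped, so each ID has 4 characters
id<-as.character(1001:1010)
st<-substr(id,1,1)      # state code: first character of the ID
cnty<-substr(id,2,5)    # county code: the remaining characters
tfr10<-rnorm(10)        # ten random values standing in for the real variable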

mydata<-cbind(id,st,cnty,tfr10)
print(mydata)
     id     st  cnty  tfr10               
 [1,] "1001" "1" "001" "1.07505442756833"  
 [2,] "1002" "1" "002" "-0.882434417011687"
 [3,] "1003" "1" "003" "2.29276525788035"  
 [4,] "1004" "1" "004" "-0.312320296652298"
 [5,] "1005" "1" "005" "1.09001860766383"  
 [6,] "1006" "1" "006" "-0.781940988103414"
 [7,] "1007" "1" "007" "-0.614135968631341"
 [8,] "1008" "1" "008" "0.515142965880679" 
 [9,] "1009" "1" "009" "0.0274456168157293"
[10,] "1010" "1" "010" "-0.538584996182184"

What I want to do is get the average of the variable "tfr10" by state.
Based on that, I will compute further variables. In other words, for each
observation, I want to calculate a new variable using the average at the
state level. Of course, this is a simple example; the real data will have
32 states, and I do not want to create a separate "mean variable" for each
state by hand in order to calculate another variable. I would rather do this
using a loop.

Or, I could potentially create a single "mean" variable that holds, for each
observation, the mean of the observations in its state, again using a loop.
Whichever way is best and easiest. I hope that this example is
understandable. Any help or direction would be greatly appreciated!
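
To make the request above concrete, here is a minimal sketch of the kind of
result I am after. It assumes the data are held in a data frame rather than
the cbind() matrix (cbind() turns tfr10 into character strings), and it uses
two made-up states so the grouping is visible. The names st_mean and
tfr10_dev are hypothetical, and the deviation from the state mean is only a
stand-in for the real calculation. ave() is one loop-free way to do this, not
necessarily the best one.

```r
# Hypothetical example data: five counties in each of two states
set.seed(1)
id <- as.character(c(1001:1005, 2001:2005))
mydata <- data.frame(
  id    = id,
  st    = substr(id, 1, 1),   # state code: first character of the ID
  cnty  = substr(id, 2, 4),   # county code: remaining three characters
  tfr10 = rnorm(10),          # kept numeric because this is a data frame
  stringsAsFactors = FALSE
)

# State-level mean of tfr10, repeated on every county row of that state
mydata$st_mean <- ave(mydata$tfr10, mydata$st, FUN = mean)

# Hypothetical derived variable: county value minus its state mean
mydata$tfr10_dev <- mydata$tfr10 - mydata$st_mean

# One mean per state, for checking
state_means <- tapply(mydata$tfr10, mydata$st, mean)
print(state_means)
```

With 32 states the same two lines would apply unchanged, since the grouping
comes from the st column rather than from one variable per state.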

Thanks,

Carlos

--
View this message in context: http://r.789695.n4.nabble.com/looping-by-grouping-variable-tp3781580p3781580.html
Sent from the R help mailing list archive at Nabble.com.


