[R] Complicated For Loop (to me)

Petr PIKAL petr.pikal at precheza.cz
Tue Nov 10 08:27:06 CET 2009


Hi

You can probably use one of the aggregating functions (by, tapply, aggregate),
for example:

aggregate(some.columns.of.data.frame, list(SLUNCH, ETHNIC, RACE, DIVISION),
          function(x) x/sum(x))

Untested on your data.
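
To make the idea concrete, here is a minimal sketch of the weighted SLUNCH share per RACE and DIVISION using aggregate. The toy data frame dat and the helper columns LUNCHWT and SHARE are my own assumptions for illustration, not part of the original data set, so adapt the names as needed.

```r
# Toy data standing in for the real survey file (values are made up).
dat <- data.frame(
  NWEIGHT  = c(1234, 2345, 3243, 1000, 500, 1500),
  RACE     = c(1, 1, 3, 1, 3, 1),
  SLUNCH   = c(1, 0, 1, 1, 0, 1),
  DIVISION = c(1, 5, 3, 1, 3, 5)
)

# Weight carried by SLUNCH participants; zero for everyone else.
dat$LUNCHWT <- dat$NWEIGHT * (dat$SLUNCH == 1)

# Sum both weight columns within every RACE x DIVISION cell.
sums <- aggregate(dat[, c("LUNCHWT", "NWEIGHT")],
                  by = list(RACE = dat$RACE, DIVISION = dat$DIVISION),
                  FUN = sum)

# Weighted share of participants in each cell.
sums$SHARE <- sums$LUNCHWT / sums$NWEIGHT
print(sums)
```

This gives one row per RACE/DIVISION combination that actually occurs in the data, so the subsets do not have to be created by hand first.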

Regards
Petr


r-help-bounces at r-project.org wrote on 10.11.2009 03:51:55:

> 
> Sorry, I've been trying to work around this and just got back to check my
> email.
> 
> dput wasn't working too well for me because the data set also has 450
> variables and I needed more time to figure out how to properly show you all
> what you needed to know.
> 
> But to show you the idea, a very simple data set would be:
> 
> NWEIGHT  ETHNIC  RACE  SLUNCH  DIVISION  .......
>    1234       0     1       1         1
>    2345       1     1       0         5
>    3243       0     3       1         3
>       .       .     .       .         .
>       .       .     .       .         .
>       .       .     .       .         .
>       .       .     .       .         .
> 
> 
> So basically, I already have the data subset by division and race. (I did
> that the inefficient way by coding it by hand)
> 
> But now I need to calculate the percentage of each division (by race) that
> participates in SLUNCH (a 0 1 variable)
> 
> So I am trying to avoid writing out code such as:
> 
> w.cd1.s <- sum(ifelse(white.cd1$SLUNCH==1, white.cd1$NWEIGHT,
> 0))/sum(white.cd1$NWEIGHT)
> w.cd2.s <- sum(ifelse(white.cd2$SLUNCH==1, white.cd2$NWEIGHT,
> 0))/sum(white.cd2$NWEIGHT)
> .... for all the variables. 
> 
> One other method that I tried, which gets me the "names" I need, but doesn't
> put them into a dataframe (which I am currently trying to fix) is by using
> this code:
> 
> 
> names <- c("white","black","hispanic","asian")
> regions <- c("cd1","cd2","cd3","cd4","cd5","cd6","cd7","cd8","cd9")
> type <- c("l", "p", "r")
> name.region <- c()
> for (j in 1:length(names)){
>    for(i in 1:length(regions)){
>       for(k in 1:length(type)){
>       name.holder <- paste(names[j], ".", regions[i], ".", type[k], sep="")
>       name.region <- c(name.region, name.holder)
>       }
>    }
> }
> 
> (The "l", "p", "r" represent other variables that I am trying to do the same
> thing as SLUNCH)
> 
> From here I've been trouble-shooting how to switch these named variables
> back into a data.frame context.
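
One way to get those labels into a data frame without the nested loops might be expand.grid, which builds every combination directly. This is only a sketch: the column names race, region, type, label and value are illustrative assumptions, and value is an empty placeholder to be filled with the computed percentages.

```r
races   <- c("white", "black", "hispanic", "asian")
regions <- paste("cd", 1:9, sep = "")
type    <- c("l", "p", "r")

# expand.grid varies its first argument fastest, so listing type first
# reproduces the order of the nested loops (type innermost, race outermost).
res <- expand.grid(type = type, region = regions, race = races,
                   stringsAsFactors = FALSE)

# Same labels as the loop builds, e.g. "white.cd1.l".
res$label <- paste(res$race, res$region, res$type, sep = ".")

# Placeholder column for the percentages, to be filled in later.
res$value <- NA_real_

head(res)
```

Each row then carries its race, region and variable type as ordinary columns, so results can be looked up by value rather than by parsing a variable name.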
> 
> Everyone's help has been really appreciated!  I've learned a lot today that
> will hopefully move me slowly from using for loops to more efficient
> functions.  I unfortunately am still learning those and have some knowledge
> about how to use loops compared to almost no knowledge of the more powerful
> functions like sapply, lapply, etc.  (I'm waiting on MASS4 to be returned to
> the library to read it.)
> 
> 
> Thanks!
> 
> 
> John Kane-2 wrote:
> > 
> > I think that we probably need a sample database of your original data.
> > A few lines of the dataset would probably be enough as long as it was
> > fairly representative of the overall data set.  See ?dput for a way of
> > conveniently supplying a sample data set.
> > 
> > Otherwise off the top of my head, I would think that you could just put
> > all your subsets into a list and use lapply  but I'm simply guessing
> > without seeing the data.
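
The list-plus-lapply suggestion could look roughly like the sketch below, where split creates the list of subsets and sapply applies one function to each. The toy data and the helper name lunch_share are assumptions made up for illustration.

```r
# Toy data standing in for the real survey file (values are made up).
dat <- data.frame(
  NWEIGHT  = c(1234, 2345, 3243, 1000, 500, 1500),
  RACE     = c(1, 1, 3, 1, 3, 1),
  SLUNCH   = c(1, 0, 1, 1, 0, 1),
  DIVISION = c(1, 5, 3, 1, 3, 5)
)

# One list element per RACE x DIVISION combination;
# drop = TRUE leaves out combinations with no observations.
pieces <- split(dat, list(dat$RACE, dat$DIVISION), drop = TRUE)

# Weighted share of SLUNCH == 1 within a single subset.
lunch_share <- function(d) sum(d$NWEIGHT[d$SLUNCH == 1]) / sum(d$NWEIGHT)

# Apply the same calculation to every subset; the result is a named vector
# with names of the form "RACE.DIVISION".
shares <- sapply(pieces, lunch_share)
print(shares)
```

The same function can be reused for the other 0/1 variables by passing the column name as an extra argument instead of hard-coding SLUNCH.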
> > 
> > --- On Mon, 11/9/09, agm. <amurray at vt.edu> wrote:
> > 
> >> From: agm. <amurray at vt.edu>
> >> Subject: Re: [R] Complicated For Loop (to me)
> >> To: r-help at r-project.org
> >> Received: Monday, November 9, 2009, 3:18 PM
> >> 
> >> I've looked through ?split and run all of the code, but I
> >> am not sure that I
> >> can use it in such a way to make it do what I need. 
> >> Another suggestion was
> >> using "lists", but again, I am sure that the process can do
> >> what I need, but
> >> I am not sure it would work with so many observations.
> >> 
> >> I might have been too simple in my code.  Let me try
> >> to explain it more
> >> clearly:
> >> 
> >> I've got a data set of 4500 observations.  I have
> >> already subset it into
> >> race/ethnicity (which I did by simple code).  Now I
> >> needed to subset each
> >> race/ethnicity again into 9 separate regions.  I again
> >> did this by simple
> >> code.
> >> 
> >> The problem is now, I need to calculate a percentage for
> >> three different
> >> variables for all 9 regions for each race.  I was
> >> trying to do this through
> >> a loop command.
> >> 
> >> So a snippet of my code is :
> >> 
> >> names <- c("white", "black", "asian", "hispanic")
> >> for(j in 1:length(names)){
> >> for(i in 1:9){
> >> names[j].cd[i].es.wash <- subset(names[j].cd[i],
> >> SLUNCH==1)
> >> es.cd[i].names.w <-
> >> sum(names.cd[i].es.wash$NWEIGHT)/sum(names.cd[i]$NWEIGHT)
> >> }
> >> }
> >> 
> >> 
> >> Maybe that makes it clearer.  If not, I
> >> apologize.  Thanks for the help that
> >> I have already received.  It is greatly appreciated.
> >> 
> >> Tony
> >> 
> >> -- 
> >> View this message in context:
> >> http://old.nabble.com/Complicated-For-Loop-%28to-me%29-tp26269479p26272994.html
> >> Sent from the R help mailing list archive at Nabble.com.
> >> 
> >> ______________________________________________
> >> R-help at r-project.org
> >> mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained,
> >> reproducible code.
> >> 
> > 
> > 
> > 
> > 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Complicated-For-Loop-%28to-me%29-tp26269479p26277512.html
> Sent from the R help mailing list archive at Nabble.com.
> 


