[R] How to eliminate for next loops in this script

Bert Gunter gunter.berton at gene.com
Wed Mar 7 20:11:39 CET 2012


I would certainly agree that given the post below, it is almost
impossible to offer much more than vague suggestions: simple, short,
reproducible examples per the posting guide rather than lengthy
complex code is required (although there are clever, perseverant folks
who may give it a go).

However, to be accurate, all of the apply family of functions, for
which ?by and ?aggregate are just wrappers, are still just basically
for/next loops at the R level. If efficiency is a consideration -- and
if it isn't, who cares? -- then vectorization (looping at the C level)
should be sought. For this, ?tabulate may be useful, but that's just a
guess. I also conjecture that careful attention to the data structure
in the below might allow efficient indexing or matrix calculations.
But I suppose that's what the OP was asking for, so that's probably
not much help. Again, a simplified example that helpeRs could easily
groc and offer neat solutions to, which might then be applied by the
OP to the real problem, would probably have a greater chance of
shedding light in the fog.

Cheers,
Bert

On Wed, Mar 7, 2012 at 10:41 AM, ilai <keren at math.montana.edu> wrote:
> ?by  ?aggregate
>
> On Tue, Mar 6, 2012 at 4:14 PM, Walter Anderson <wandrson01 at gmail.com> wrote:
>> I needed to compute a complicated cross tabulation to show weighted means
>> and standard deviations and the only method I could get that worked uses a
>> series of nested for next loops.  I know that there must be a better way to
>> do so, but could use some assistance pointing the way.
>>
>> Here is my working, but inefficient script:
>>
>> library(Hmisc)
>> rm(list=ls())
>> load('NHTS.Rdata')
>> day.wt <- day[c("HOUSEID","WTTRDFIN", "TRIPPURP","TRVLCMIN")]
>> hh.wt <- hh[c("HOUSEID","WTHHFIN","HHSIZE","HHVEHCNT","HOMETYPE")]
>> hh.wt$HHBIN <- with(hh.wt,{ cut(HHSIZE,
>> breaks=c(0,1,2,3,4,max(HHSIZE)),labels=c("1","2","3","4","5+"),ordered_result=TRUE)})
>> hh.wt$VEHBIN <- with(hh.wt,{ cut(HHVEHCNT,
>> breaks=c(-1,0,1,2,max(HHVEHCNT)),labels=c("0","1","2","3+"),ordered_result=TRUE)})
>> hh.wt$DUTYPE <- factor(hh.wt$HOMETYPE, exclude=c("-7","-8","-9"))
>> levels(hh.wt$DUTYPE) <- c("1","1","2","2","2","2","2")  # Convert home
>> types to 1=SF and 2=MF
>> trips <- merge(day.wt,hh.wt,by="HOUSEID")
>> trips$PURPOSE <- factor(trips$TRIPPURP, exclude=c("-9"))
>> trips <- trips[c("HOUSEID","HHBIN","VEHBIN","DUTYPE","PURPOSE","WTTRDFIN",
>> "WTHHFIN")]
>> sink('TripRates.csv')
>> cat("HHBIN,VEHBIN,DUTYPE,TRIPPURP,COUNT,MEAN,STD.DEV\n")
>> for (per in levels(trips$HHBIN))
>>  for (veh in levels(trips$VEHBIN))
>>    for (hom in levels(trips$DUTYPE))
>>      for (pur in levels(trips$PURPOSE))
>>      {
>>        cat(per)
>>        cat(",")
>>        cat(veh)
>>        cat(",")
>>        cat(hom)
>>        cat(",")
>>        cat(pur)
>>        cat(",")
>>        tmp <- subset(trips,
>>                      trips$HHBIN == per & trips$VEHBIN == veh &
>>                        trips$DUTYPE == hom & trips$PURPOSE == pur,
>>                      select=c("HOUSEID","WTTRDFIN", "WTHHFIN"))
>>        cat(length(tmp$HOUSEID))
>>        cat(",")
>>        tmp$tr <- (tmp$WTTRDFIN/365)/tmp$WTHHFIN
>>        w.m <- wtd.mean(tmp$tr,weights=tmp$WTHHFIN)
>>        w.sd <- sqrt(wtd.var(tmp$tr, weights=tmp$WTHHFIN))
>>        cat(w.m)
>>        cat(",")
>>        cat(w.sd)
>>        cat("\n")
>>      }
>> sink()
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



More information about the R-help mailing list