[R] Resources for optimizing code

Tony Plate tplate at acm.org
Fri Nov 5 19:46:36 CET 2004


Have you tried reading the manual "An Introduction to R", with special 
attention to "Array Indexing"?  (Indexing for data frames is pretty similar 
to indexing for matrices.)
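
As a rough illustration of that similarity, here is a minimal sketch that applies the same logical row selector to a matrix and to a data frame. The names m, d and sel are made up for this example and do not come from the original question.

```r
# Illustrative objects only; none of these names appear in the original post.
m <- matrix(1:6, nrow = 3)        # 3 x 2 matrix
d <- as.data.frame(m)             # same values as a data frame
sel <- c(TRUE, FALSE, TRUE)       # keep rows 1 and 3

# The [rows, columns] syntax is the same for both objects.
# drop = FALSE keeps the result two-dimensional.
m.sub <- m[sel, , drop = FALSE]   # still a matrix, 2 x 2
d.sub <- d[sel, , drop = FALSE]   # still a data frame, 2 x 2
```

In both cases the empty second index means "all columns".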

Unless I'm misunderstanding, what you want to do is very simple.  It is 
possible to use numeric vectors with 0 and 1 to indicate whether you want 
to keep the row, but it's a little easier with logical vectors.  Here's an 
example:

 > x <- data.frame(a=1:5,b=letters[1:5])
 > keep.num <- ifelse(x$a %% 2 == 1, 1, 0)
 > keep.num
[1] 1 0 1 0 1
 > keep.logical <- (x$a %% 2) == 1
 > keep.logical
[1]  TRUE FALSE  TRUE FALSE  TRUE
 > x[keep.num==1,,drop=FALSE]
  a b
1 1 a
3 3 c
5 5 e
 > x[keep.logical,,drop=FALSE]
  a b
1 1 a
3 3 c
5 5 e
 >
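
The same idea can be wrapped in a small function that takes a 0/1 vector and a data frame, with no loop. This is only a sketch: the name keep.rows is hypothetical, and dropping rows whose indicator is NA is an assumption about the intended behaviour, not something stated in the question.

```r
# Hypothetical helper (not from the original post): keep the rows of `df`
# whose entry in the 0/1 vector `asst` equals 1.
keep.rows <- function(asst, df) {
    stopifnot(length(asst) == nrow(df))
    # Assumption: a missing indicator means "do not keep".
    # which() returns only the positions that are TRUE, so NA is dropped.
    df[which(asst == 1), , drop = FALSE]
}

x <- data.frame(a = 1:5, b = letters[1:5])
asst <- c(1, 0, NA, 0, 1)
kept <- keep.rows(asst, x)
print(kept)
```

Indexing with asst == 1 directly would instead return a row of NAs for each missing indicator, which is why which() is used here.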



At Friday 10:34 AM 11/5/2004, Janet Elise Rosenbaum wrote:

>I want to eliminate certain observations in a large dataframe (21000x100).
>I have written code which does this using a binary vector (0=delete obs,
>1=keep), but it uses for loops, and so it's slow and in the extreme it
>causes R to hang for indefinite time periods.
>
>I'm looking for one of two things:
>1.  A document which discusses how to avoid for loops and situations in
>which it's impossible to avoid for loops.
>
>or
>
>2.  A function which can do the above better than mine.
>
>My code is pasted below.
>
>Thanks so much,
>
>Janet
>
># asst is a binary vector of length= nrow(DATAFRAME).
># 1= observations you want to keep.  0= observation to get rid of.
>
>remove.xtra.f <-function(asst, DATAFRAME) {
>         n<-sum(asst, na.rm=T)
>         newdata<-matrix(nrow=n, ncol=ncol(DATAFRAME))
>         j<-1
>         for(i in 1:nrow(DATAFRAME)) {
>                 if (asst[i]==1) {
>                         newdata[j,]<-DATAFRAME[i,]
>                         j<-j+1
>                 }
>         }
>         newdata.f<-as.data.frame(newdata)
>         names(newdata.f)<-names(DATAFRAME)
>         return(newdata.f)
>}
>--
>Janet Rosenbaum                                 jerosenb at fas.harvard.edu
>PhD Candidate in Health Policy, Harvard GSAS
>Harvard Injury Control Research Center, Harvard School of Public Health
>



