[R] looping using combinatorics

Jesse Albert Canchola jesse.canchola.b at bayer.com
Fri Jul 14 22:39:39 CEST 2006


Great.  I will work on the problem with those definitions in mind. Thanks 
for your help, Gabor.  I'll post the final solution when it is ready.

Best,
Jesse




"Gabor Grothendieck" <ggrothendieck at gmail.com> 
07/14/2006 11:59 AM

To
"Jesse Albert Canchola" <jesse.canchola.b at bayer.com>
cc
r-help at stat.math.ethz.ch
Subject
Re: [R] looping using combinatorics






If a, b and c are numeric vectors then rbind(a,b,c) and cbind(a,b,c)
produce matrices, not data frames, and iterating over a matrix in a for
loop iterates over the elements of the matrix, whereas iterating over a
data frame in a for loop iterates over the columns.  You can use
as.data.frame(my.matrix) to convert a matrix to a data frame.

Compare:

> for(i in matrix(1:4,2)) print(i)
[1] 1
[1] 2
[1] 3
[1] 4
> for(i in as.data.frame(matrix(1:4,2))) print(i)
[1] 1 2
[1] 3 4
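
A minimal sketch of the row-wise selection discussed in this thread, under
some assumptions: it relies on combn() from the utils package that ships
with current R (the thread itself uses the "combinat" package from CRAN),
and the names frame_a, frame_b, frame_c, pairs and subsets are hypothetical,
chosen only for illustration.  Each pair produced by combn() is matched
against the "index" column with %in%, so rows are selected rather than
columns.  This is one possible approach, not necessarily the final solution.

```r
# Illustrative data mirroring the small example frames in this thread
frame_a <- data.frame(ID = 1:2, meas = c(1.1, 2.1), index = 1)
frame_b <- data.frame(ID = 3:4, meas = c(1.2, 2.2), index = 2)
frame_c <- data.frame(ID = 5:6, meas = c(1.3, 1.4), index = 3)

# rbind() on data frames stacks the rows and returns a data frame
DF <- rbind(frame_a, frame_b, frame_c)

# combn(3, 2) returns a 2 x 3 matrix; converting it to a data frame
# makes each column (one pair of index values) a separate loop item
pairs <- as.data.frame(combn(3, 2))

# For each pair, keep the rows of DF whose index value is in that pair
subsets <- lapply(pairs, function(pair) DF[DF$index %in% pair, ])

# Print each row subset in turn
for (s in subsets) print(s)
```

Keeping the subsets in a list means the same computation can later be
applied to every pair with lapply() or sapply() instead of only printing.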



On 7/14/06, Jesse Albert Canchola <jesse.canchola.b at bayer.com> wrote:
> Thanks.  It is actually the rows I want to choose from, not the columns
> (the columns will remain the same with the same names). A slightly
> abbreviated and modified example:
>
> data frame "a" has
> ID  meas index
> 1   1.1  1
> 2   2.1  1
>
> data frame "b" has
> ID meas index
> 3  1.2  2
> 4  2.2  2
>
> data frame "c" has
> ID meas index
> 5 1.3  3
> 6 1.4  3
>
> rbind the three frames "a", "b", and "c" into "d":
> ID  meas  index
> 1   1.1  1
> 2   2.1  1
> 3   1.2   2
> 4   2.2   2
> 5  1.3   3
> 6  1.4   3
>
> The three (3 choose 2) pairs we want will be as follows.
> Using "combn" from the "combinat" package on CRAN, we get the pairs
> (1,2), (1,3), (2,3), which can be used as the index in the "for" loop
> (as you have used below).
> In this case, the pair (1,2) refers to the subset of the data frame
> "d", above, where the variable named index equals 1 or 2 (and so on
> for the other pairs).
>
> So the first chosen pair would be (1,2) and the resulting subset of the
> data frame "d" looks like this:
> ID  meas  index
> 1   1.1  1
> 2   2.1  1
> 3   1.2   2
> 4   2.2   2
>
> and so on for the other pairs.  So the "rbind" is correct; it is the
> "for" loop that needs to be modified to grab the subsets from the data
> frame DF below (i.e., the (1,2) pair selected by the combn function
> will select data from DF where the variable index=1 or index=2, using
> the example above).
>
> #### BEGIN CODE ####
> DF <- rbind(a,b,c)
> DF
> for(index in as.data.frame(combn(3,2))) print(DF[,index])
> #### END CODE ####
>
>
>
> Regards,
> Jesse
>
>
>
>
>
>
>
>
> "Gabor Grothendieck" <ggrothendieck at gmail.com>
> 07/14/2006 11:01 AM
>
> To
> "Jesse Albert Canchola" <jesse.canchola.b at bayer.com>
> cc
> r-help at stat.math.ethz.ch
> Subject
> Re: [R] looping using combinatorics
>
>
>
>
>
>
> Use data.frame, not rbind, e.g. DF <- data.frame(a, b, c)
>
> On 7/14/06, Jesse Albert Canchola <jesse.canchola.b at bayer.com> wrote:
> > Many thanks, Gabor.  This is very close to what would be ideal.  You
> > gave me an idea as follows:
> >
> > Rather than combine pairs of data vectors/frames AFTER the "combn"
> > function, combine all data before (though I believe this would be less
> > efficient) and add an index, then use that index to choose your pairs
> > (or whatever combinatorics you are using; e.g., 8 choose 4, so all
> > combinations of 4 out of 8 for a total of 70 combinations).
> >
> > Example data frames with variable names:
> >
> > Data frame "a" where I add an "index":
> > id measure index
> > 1  1.1  1
> > 2  1.2  1
> > 3  1.3  1
> >
> > Data frame "b" where I add an "index":
> > id measure index
> > 4  2.1  2
> > 5  2.2  2
> > 6  2.3  2
> >
> > Data frame "c" where I add an index:
> > id measure index
> > 7  3.1  3
> > 8  3.2  3
> > 9  3.3  3
> >
> > If we combine all these data at once using rbind, we get:
> > id measure index
> > 1  1.1  1
> > 2  1.2  1
> > 3  1.3  1
> > 4  2.1  2
> > 5  2.2  2
> > 6  2.3  2
> > 7  3.1  3
> > 8  3.2  3
> > 9  3.3  3
> >
> > We can then use something similar to your code and the index to choose
> > the required pairs as derived from the "combn" function.
> > For example, the "combn" function will choose the data pairs
> > (1,2)
> > (1,3)
> > (2,3)
> >
> > where, for example, the pairs (1,2) will have data from frames "a" and
> > "b":
> > 1  1.1  1
> > 2  1.2  1
> > 3  1.3  1
> > 4  2.1  2
> > 5  2.2  2
> > 6  2.3  2
> >
> > so that we can go down the list subsetting what we need and doing
> > operations on each combined pair as we go.
> >
> > Is there an easy way in R to do this operation?
> >
> > For the above, an attempt might be:
> >
> > ########## STAB CODE #########
> > DF <- rbind(a,b,c)
> > DF
> > for(index in as.data.frame(combn(3,2))) print(DF[,index])
> > ######## END STAB CODE ######
> >
> > but this is choosing 3 choose 2 COLUMNS within the combined file
> > rather than 3 choose 2 ROWS.
> >
> >
> > Best regards and TIAA,
> > Jesse
> >
> >
> >
> >
> >
> > "Gabor Grothendieck" <ggrothendieck at gmail.com>
> > 07/13/2006 07:39 PM
> >
> > To
> > "Jesse Albert Canchola" <jesse.canchola.b at bayer.com>
> > cc
> > r-help at stat.math.ethz.ch
> > Subject
> > Re: [R] looping using combinatorics
> >
> >
> >
> >
> >
> >
> > I assume your question is: given 3 vectors of the same length, a, b and
> > c, how do we loop over pairs of them?  In the following, each iteration
> > displays one pair:
> >
> >   library(combinat)
> >   DF <- data.frame(a = 1:4, b = 5:8, c = 9:12)
> >   for(idx in as.data.frame(combn(3,2))) print(DF[,idx])
> >
> > On 7/13/06, Jesse Albert Canchola <jesse.canchola.b at bayer.com> wrote:
> > > I have a problem where I need to loop over the total combinations of
> > > vectors (combined once chosen via combinatorics).  Here is a
> > > simplification of the problem:
> > >
> > > STEP 1:  Define three vectors a, b, c.
> > > STEP 2:  Combine all possible pairwise vectors (i.e., 3 choose 2 = 3
> > > possible pairs of vectors: ab,ac, bc)
> > > NOTE:  the actual problem has 8 choose 4, 8 choose 5 and 8 choose 6
> > > combinations.
> > > STEP 3:  Do the same math on each pairwise combination and spit out
> > > answers each time
> > >
> > > ####### BEGIN CODE #######
> > > #STEP 1
> > > a1 <- c(1,2,3,4,5,6,7,8,9,10,11,12)
> > > a <- matrix(a1,2,3,byrow=T)
> > > a
> > >
> > > b1 <- c(13,14,15,16,17,18,19,20,21,22,23,24)
> > > b <- matrix(b1,2,3,byrow=T)
> > > b
> > >
> > > c1 <- c(25,26,27,28,29,30,31,32,33,34,35,36)
> > > > c <- matrix(c1,2,3,byrow=T)
> > > c
> > >
> > > # example:  combine the first two vectors "a" and "b"
> > > combab <- rbind(a,b)
> > >
> > > > # the a,b combined data from the algorithm later below should look
> > > > # something like the following:
> > > > combab
> > > >
> > > > # use the combinatorics "combn" function found in the "combinat"
> > > > # package on CRAN
> > > m <- combn(3,2) # three choose two combinations
> > > m
> > >
> > > > # the first assignment below should be numeric and then subsequent
> > > > # assignments as character, since the first time you assign a
> > > > # character into a numeric matrix, the rest of the numbers in the
> > > > # matrix are coerced to character
> > > m[m==1]='a'; m[m=='2']='b'; m[m=='3']='c'
> > > m
> > >
> > > #STEP 2: combine pairwise vectors into a matrix or frame
> > > for (i in dim(m)[1])
> > >    for (j in dim(m)[2])
> > >        {
> > > >            # cat/format removes the quotes
> > > >            combined <- rbind(cat(format(m[i]),"\n"),cat(format(m[j]),"\n"))
> > >            combined
> > >        }
> > > traceback()
> > >
> > >
> > > #STEP 3: {not there yet}
> > > ################# END CODE ################
> > >
> > > > The problem is that in STEP 2 (not complete), the results in the rbind
> > > > are not recognized as the objects they represent (i.e., the "a" without
> > > > quotes is not recognized as the data object we defined in STEP 1).
> > > > Perhaps this is a parsing problem.  Perhaps there is an alternative way
> > > > to do this.  I looked pretty long and hard in the CRAN libraries but
> > > > alas, I am stuck.  BTW, I picked up R about a month ago (I used
> > > > primarily SAS, Stata and SPSS).
> > >
> > > Regards and TIA,
> > > Jesse
> > >
> > >
> > >
> > >
> > >
> > >
> > > Jesse A. Canchola
> > > Biostatistician III
> > > Bayer Healthcare
> > > 725 Potter St.
> > > Berkeley, CA 94710
> > > P: 510.705.5855
> > > F: 510.705.5718
> > > E: Jesse.Canchola.b at Bayer.Com
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
>
>
>


