[R] help, please! matrix operations inside 3 nested loops

R. Michael Weylandt michael.weylandt at gmail.com
Wed Aug 8 17:21:43 CEST 2012


On Wed, Aug 8, 2012 at 9:06 AM, Fridolin <smells_like_rock at gmx.net> wrote:
> hello, this is my script:
>
> #1) read in data:
> daten<-read.table('K:/Analysen/STRUCTURE/input_STRUCTURE_tab_excl_5_282_559.txt',
> header=TRUE, sep="\t")
> daten<-as.matrix(daten)
>
> #2) create empty matrix:
> indxind<-matrix(nrow=617, ncol=617)
> indxind[1:20,1:19]
>
> #3) compare cells to each other, score:
> #walks through the matrix column by column, starting at column 3
> for (s in 3:34) {
>   #for each current column, take one row (z1)...
>   for (z1 in 1:617) {
>     #...and compare it to another row (z2) of the current column
>     for (z2 in 1:617) {
>       if (z1!=z2) {topf<-indxind[z1,z2]
>                    #actually, 2 rows make up 1 individual,
>                    #therefore i compare 2 rows with another 2 rows
>                    if (daten[2*z1-1,s]==daten[2*z2-1,s]) topf<-topf+1
>                    if (daten[2*z1-1,s]==daten[2*z2,s]) topf<-topf+1
>                    if (daten[2*z1,s]==daten[2*z2-1,s]) topf<-topf+1
>                    if (daten[2*z1,s]==daten[2*z2,s]) topf<-topf+1
>                    indxind[z1,z2]<-topf
>                    indxind[z2,z1]<-topf
>                   }
>       ##counts s, z1 and z2 properly, but gives NA for indxind[1,2]:
>       #print(c(s,z1,z2,indxind[1,2]))
>       }
>     #indxind[1:5,1:5] #empty matrix
>   }
>   #indxind[1:5,1:5] #empty matrix
>   }
>
> #4) check:
> indxind[1:5,1:5]
>
> this results in no errors, but my matrix indxind remains empty (only NAs).
> though all columns and rows are counted properly. R needs quite a while to
> get through all this (there are probably smarter and faster ways to
> calculate this but i am not too deep into R and bioinformatics, and i need
> to calculate this only once). could the 3 for-loops already be too
> computationally intense for adding matrix operations?
>
> any help would be much appreciated!
>
> thx, frido
>
>

Hi Frido,

I'm afraid I get a little lost in your code, but I'd be willing to bet
we can cut the loops out entirely and speed things up.
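
To give a flavour of what I mean, here is a rough sketch. It rests on
my guess that you want, for every pair of individuals, the number of
matching allele pairs summed over the locus columns. The function name
pair_scores and the toy data are made up for illustration. It still
loops over the columns, but the two inner loops are replaced by outer():

```r
# Hypothetical helper: 'alleles' holds only the locus columns, with two
# consecutive rows per individual (rows 1-2 = individual 1, and so on).
pair_scores <- function(alleles) {
  n_ind <- nrow(alleles) / 2
  a1 <- alleles[seq(1, 2 * n_ind, by = 2), , drop = FALSE]  # first row of each individual
  a2 <- alleles[seq(2, 2 * n_ind, by = 2), , drop = FALSE]  # second row of each individual
  score <- matrix(0, n_ind, n_ind)                          # start from 0, not NA
  for (s in seq_len(ncol(alleles))) {
    # each outer() call compares one allele of every individual with
    # one allele of every other individual in a single step
    score <- score +
      outer(a1[, s], a1[, s], "==") +
      outer(a1[, s], a2[, s], "==") +
      outer(a2[, s], a1[, s], "==") +
      outer(a2[, s], a2[, s], "==")
  }
  diag(score) <- NA  # an individual is not compared with itself
  score
}

# Toy data (invented): 4 individuals, 3 loci, alleles coded 1 to 3
set.seed(1)
toy <- matrix(sample(1:3, 2 * 4 * 3, replace = TRUE), nrow = 8)
pair_scores(toy)
```

With your data that would be something like pair_scores(daten[, 3:34]).
Note that this counts each pair once per column, whereas your loops
visit every pair twice (as z1,z2 and again as z2,z1), so your version
would give double these numbers.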

Can you give us a "big picture" description of the algorithm you're
implementing as well as (if it's not too hard) a small reproducible
example [1]?
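
Something along these lines would do. The values are invented (three
individuals, two made-up id columns, two locus columns), so please
substitute a few rows of your real data. A small example like this also
makes one thing easy to check: matrix(nrow=, ncol=) with no data
argument is filled with NA, and NA + 1 is still NA, which may well be
why indxind never changes.

```r
# Invented toy data: 6 rows = 3 individuals, 4 columns
daten <- matrix(c(1, 1, 2, 2, 3, 3,    # column 1: made-up individual id
                  1, 1, 1, 1, 1, 1,    # column 2: made-up group id
                  5, 6, 5, 7, 8, 8,    # column 3: first locus
                  2, 2, 2, 3, 3, 3),   # column 4: second locus
                nrow = 6)

# As in the original script: no data given, so every cell is NA
indxind <- matrix(nrow = 3, ncol = 3)
topf_na <- indxind[1, 2]
if (daten[1, 3] == daten[3, 3]) topf_na <- topf_na + 1
is.na(topf_na)    # the increment is lost because NA + 1 is NA

# Same step, but starting from a matrix of zeros
indxind0 <- matrix(0, nrow = 3, ncol = 3)
topf_zero <- indxind0[1, 2]
if (daten[1, 3] == daten[3, 3]) topf_zero <- topf_zero + 1
topf_zero
```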

Note also that most of us don't use Nabble so you'll need to
explicitly quote any relevant context.

Thanks,
Michael

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example


