[R] Apply function to every 20 rows between pairs of columns in a matrix

arun smartpink111 at yahoo.com
Tue Nov 12 04:40:05 CET 2013



Hi,
Maybe this is what you wanted.
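# For each sample (a row name of res[[1]]), take that sample's row from every
# bin in res and stack those rows into one matrix.
res2 <- lapply(row.names(res[[1]]), function(x)
  do.call(rbind, lapply(res, function(y) y[match(x, row.names(y)), ])))
length(res2)
#[1] 48
dim(res2[[1]])
#[1] 2325    8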
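
To check what the regrouping does, here is a minimal sketch on toy data (3 bins, 2 samples, 2 founders). The names toy_res, toy_res2, S1/S2 and F1/F2 are made up for illustration. It uses the same lapply/match/rbind pattern and then adds an optional step that is not part of the code above: reordering the rows by numeric bin id, on the assumption that you want them in genomic order. split() on character labels returns the bins in alphabetical order ("1", "10", "2", ...).

```r
# Toy stand-in for `res`: a named list of 3 bins, each a 2 samples x 2 founders matrix.
# The names are character labels in alphabetical order, as split() would return them.
bins <- c("1", "10", "2")
toy_res <- lapply(seq_along(bins), function(i) {
  matrix(i + c(0, 0.1, 0.2, 0.3), nrow = 2,
         dimnames = list(c("S1", "S2"), c("F1", "F2")))
})
names(toy_res) <- bins

# Same pattern as above: for each sample, pull its row from every bin and stack them.
toy_res2 <- lapply(row.names(toy_res[[1]]), function(x)
  do.call(rbind, lapply(toy_res, function(y) y[match(x, row.names(y)), ])))
names(toy_res2) <- row.names(toy_res[[1]])

# Optional: put the rows in numeric bin order (1, 2, 10) instead of alphabetical order.
toy_res2 <- lapply(toy_res2, function(m)
  m[order(as.numeric(row.names(m))), , drop = FALSE])

dim(toy_res2[["S1"]])        # 3 rows (bins) by 2 columns (founders)
row.names(toy_res2[["S1"]])  # "1" "2" "10"
```

The row names of each stacked matrix come from names(toy_res), which is why the numeric reordering can be done afterwards.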

A.K.


On Monday, November 11, 2013 10:20 PM, Yu-yu Ren <renyangsu at gmail.com> wrote:

Thank you so much for that script, it works great. One additional request: how can I bind the rows for each sample across the 2325 matrices, resulting in 48 matrices of 8 columns by 2325 rows?




On Mon, Nov 11, 2013 at 10:02 PM, arun <smartpink111 at yahoo.com> wrote:


>
>Hi,
>I already sent a reply to R-help.  I am not sure about the "2342".
>
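>set.seed(25)
># Simulated data: 46482 rows by 56 columns of random alleles
>dat1 <- as.data.frame(matrix(sample(c("A","T","G","C"),46482*56,replace=TRUE),ncol=56,nrow=46482),stringsAsFactors=FALSE)
># Split into bins of 20 consecutive rows (46482 = 2324*20 + 2, so the last bin has only 2 rows)
>lst1 <- split(dat1,as.character(gl(nrow(dat1),20,nrow(dat1))))
># For each bin: number of matches between each of columns 1:8 and each of columns 9:56, divided by 20
>res <- lapply(lst1,function(x) sapply(x[,1:8],function(y) sapply(x[,9:56], function(z) sum(y==z)/20)))
>
>length(res)
>#[1] 2325  ### check here
>dim(res[[1]])
>#[1] 48  8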
>
>A.K.
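
One possible variation on the quoted code, offered as a sketch and not as a correction. It builds the bin index with integer division so that split() returns the bins in numeric order. It also divides by the actual number of rows in each bin instead of a fixed 20, which matters when the row count is not a multiple of 20. The data are a small random stand-in (45 rows, 2 founder columns, 3 sample columns), and bin_size, n_founders, bin_id and sim are illustrative names.

```r
set.seed(1)
bin_size   <- 20
n_founders <- 2   # toy stand-in for the 8 founder columns
toy <- as.data.frame(matrix(sample(c("A", "T", "G", "C"), 45 * 5, replace = TRUE),
                            ncol = 5), stringsAsFactors = FALSE)

# Integer bin ids 0, 0, ..., 1, 1, ..., 2; split() on numbers keeps numeric order.
bin_id <- (seq_len(nrow(toy)) - 1) %/% bin_size
lst <- split(toy, bin_id)

# Fraction of matching alleles per (sample, founder) pair, using each bin's own row count.
sim <- lapply(lst, function(x) {
  founders <- x[, seq_len(n_founders), drop = FALSE]
  samples  <- x[, -seq_len(n_founders), drop = FALSE]
  sapply(founders, function(f) sapply(samples, function(s) mean(f == s)))
})

length(sim)     # 3 bins: two of 20 rows and one of 5 rows
dim(sim[[1]])   # 3 samples by 2 founders
```

Using mean(f == s) gives a proportion between 0 and 1 for every bin, including a short final one.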
>
>
>
>
>On Monday, November 11, 2013 10:00 PM, Yu-yu Ren <renyangsu at gmail.com> wrote:
>
>Thank you, I have uploaded several example files, with intermediate outputs of what I have done and the logic flow.
>
>
>
>
>On Mon, Nov 11, 2013 at 9:37 PM, <smartpink111 at yahoo.com> wrote:
>
>
>>Hi,
>>
>>It is not clear what you mean by comparing the first 8 columns separately with columns 9-56.  Also, please provide a reproducible example (using ?dput) for others to work on.
>>
>>A.K.
>><quote author='Renyulb28'>
>>Hi all, I have a set of genetic SNP data that looks like
>>
>>Founder1 Founder2 Founder3 Founder4 Founder5 Founder6 Founder7 Founder8
>>Sample1 Sample2 Sample3 Sample...
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>A A A T T T T T A T A T
>>
>>The size of the matrix is 56 columns by 46482 rows. I need to first bin the
>>matrix by every 20 rows, then compare each of the first 8 columns (founders)
>>to each of columns 9-56, and divide the total number of matching
>>letters/alleles by the total number of rows (20). Ultimately I need 48
>>matrices of 8 columns by 2342 rows, which are essentially similarity
>>matrices. I have tried to extract each pair separately by something like
>>
>>"length(cbind(odd[,9],odd[,1])[cbind(odd[,9],cbind(odd[,9],odd[,1])[,1])[,1]=="T"
>>& cbind(odd[,9],odd[,1])[,2]=="T",])/nrow(cbind(odd[,9],odd[,1]))"
>>
>>but this is nowhere near efficient, and I do not know of a faster way of
>>applying the function to every 20 rows and across multiple pairs.
>>
>>In the example given above, if the rows were all identical across 20 rows
>>as shown, then the first row of the matrix for Sample1 would be
>>
>>1 1 1 0 0 0 0 0
>>
>></quote>
>>Quoted from:
>>http://r.789695.n4.nabble.com/Apply-function-to-every-20-rows-between-pairs-of-columns-in-a-matrix-tp4680272.html
>>
>>
>>
>>
>


