# [BioC] sort the difference and save to individual files problem

Fri Jul 30 15:42:47 CEST 2004

```Put everything in a matrix and then use the apply() family to find the
index with the highest value. You need to add one more line to my function,
just before the return(results):
rownames(results) <- rownames(m)
so your output will have rownames. Then something like this would work.

pairwise.difference <- function(m){
  npairs  <- choose( ncol(m), 2 )
  results <- matrix( NA, nc=npairs, nr=nrow(m) )
  cnames  <- rep(NA, npairs)
  # Give the columns default names if the input has none
  if( is.null(colnames(m)) ) colnames(m) <- paste("col", 1:ncol(m), sep="")

  # Fill one result column per unordered pair (i, j) with i < j
  k <- 1
  for(i in 1:ncol(m)){
    for(j in 1:ncol(m)){
      if(j <= i) next;
      results[ ,k] <- m[ ,i] - m[ ,j]
      cnames[k]    <- paste(colnames(m)[ c(i, j) ], collapse=".vs.")
      k <- k + 1
    }
  }

  colnames(results) <- cnames
  rownames(results) <- rownames(m)
  return(results)
}

# Example using a matrix with 5 gene/row and 4 columns
mat <- matrix( sample(1:20), nc=4 )
colnames(mat) <- LETTERS[1:4]
rownames(mat) <- paste( "g", 1:5, sep="")

mat
    A  B  C  D
g1 10 16  3 15
g2 18  5 12 19
g3  7  4  8 13
g4 14  2  6 11
g5 17  1 20  9

(out <- pairwise.difference(mat))
   A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
g1     -6      7     -5     13      1    -12
g2     13      6     -1     -7    -14     -7
g3      3     -1     -6     -4     -9     -5
g4     12      8      3     -4     -9     -5
g5     16     -3      8    -19     -8     11

# Now show the 3 genes with largest absolute value in each column

apply(abs(out), 2, function(x) names(x[order(-x)]) [ 1:3 ])
     A.vs.B A.vs.C A.vs.D B.vs.C B.vs.D C.vs.D
[1,] "g5"   "g4"   "g5"   "g5"   "g2"   "g1"
[2,] "g2"   "g1"   "g3"   "g1"   "g3"   "g5"
[3,] "g4"   "g2"   "g1"   "g2"   "g4"   "g2"

This says that g5 had the largest absolute difference between A and B
followed by g2 and so on. If you want the whole list, remove the [ 1:3 ]
part from the code above.

Viewing this output is easier than viewing 100 files and lets you see
the genes that are picked up most frequently.

On Fri, 2004-07-30 at 13:24, Dr_Gyorffy_Balazs wrote:
>
> thank you for the help!
>
> However, this way I have one big table with all the data in
> it. The problem is that I also have the gene names (in the
> first column of the initial table), and I would like to
> have not only the difference, but also the ranked difference
> with the gene names. So at the end I would know which gene
> had the biggest difference (or the smallest). I was
> thinking of saving to different files in order to keep the
> gene names.
>
> (You are right, I really don't need 100 columns. It seemed
> simpler to me to construct the function to get 100
> results instead of correcting for symmetrical and
> self-tests. :-))
>
> Balazs

```
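The apply() call in the reply returns only gene names, while the quoted question also asks for the ranked differences themselves. The following is a minimal sketch of one way to keep both, not the code from the original post. The helper name `rank.differences` and its argument `n` are hypothetical, and the use of combn() in place of the double loop is a swapped-in shortcut that should give the same column order as pairwise.difference(). The example matrix is the one printed in the reply, typed in by hand so the result is reproducible.

```r
# Hypothetical helper (not part of the original post): for every comparison
# column of a difference matrix `d`, list the top `n` genes by absolute
# difference together with their signed difference.
rank.differences <- function(d, n = nrow(d)) {
  stopifnot(!is.null(rownames(d)), !is.null(colnames(d)))
  n <- min(n, nrow(d))
  ranked <- lapply(colnames(d), function(cn) {
    x   <- d[, cn]
    idx <- order(-abs(x))[seq_len(n)]      # largest absolute difference first
    data.frame(comparison = rep(cn, n),
               rank       = seq_len(n),
               gene       = rownames(d)[idx],
               difference = unname(x[idx]),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, ranked)                   # one long table, all comparisons
}

# The example matrix from the reply, entered column by column
mat <- matrix(c(10, 18,  7, 14, 17,
                16,  5,  4,  2,  1,
                 3, 12,  8,  6, 20,
                15, 19, 13, 11,  9),
              ncol = 4,
              dimnames = list(paste("g", 1:5, sep = ""), LETTERS[1:4]))

# All pairs (i, j) with i < j; assumed equivalent to the double loop above
pairs <- combn(ncol(mat), 2)
out   <- mat[, pairs[1, ], drop = FALSE] - mat[, pairs[2, ], drop = FALSE]
colnames(out) <- paste(colnames(mat)[pairs[1, ]],
                       colnames(mat)[pairs[2, ]], sep = ".vs.")

top <- rank.differences(out, n = 3)
```

Because `top` is a single long table with `comparison`, `rank`, `gene` and `difference` columns, it could be written out once with write.table() rather than as one file per comparison, and subset(top, comparison == "A.vs.B") recovers the ranking for any single pair.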