[BioC] How to sort a matrix based on its column names and preserving the identical column names

Oleg Sklyar osklyar at ebi.ac.uk
Fri Aug 3 18:22:49 CEST 2007


Yes, I know sort does not change values: in contrast to order, it
returns the sorted values rather than their indexes, and I pointed out
the correct solution. What I meant by working halfway is this:
colnames() returns a character vector, and sort rearranges the elements
of that vector. A matrix allows columns to be accessed by name, so
indexing with the sorted names should be equivalent to using order and
accessing columns by index (but it is wrong if the names are not unique,
as the example shows). If any values other than the colnames were used
with sort (say, the values in the first row), the result would be a
total mess, because the sorted values would not identify columns:


R version 2.6.0 Under development (unstable) (2007-07-30 r42359) 
> a<-matrix(runif(16),4,4)
> colnames(a)<-c("c","b","a","c")
> a[,colnames(a)] ## correct if unique names, here wrong
             c         b         a         c
[1,] 0.6674110 0.1693423 0.5741207 0.6674110
[2,] 0.4479471 0.1374272 0.1149747 0.4479471
[3,] 0.4328296 0.4990545 0.2777478 0.4328296
[4,] 0.8944030 0.1354652 0.4950811 0.8944030
> a[,sort(colnames(a))] ## correct if unique names
             a         b         c         c
[1,] 0.5741207 0.1693423 0.6674110 0.6674110
[2,] 0.1149747 0.1374272 0.4479471 0.4479471
[3,] 0.2777478 0.4990545 0.4328296 0.4328296
[4,] 0.4950811 0.1354652 0.8944030 0.8944030
> a[,order(colnames(a))] ## correct 
             a         b         c          c
[1,] 0.5741207 0.1693423 0.6674110 0.43714271
[2,] 0.1149747 0.1374272 0.4479471 0.79047094
[3,] 0.2777478 0.4990545 0.4328296 0.02128344
[4,] 0.4950811 0.1354652 0.8944030 0.93321638
>
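
The same point can be reproduced with fixed values instead of runif(),
which makes the duplicated column easier to spot. This is only a minimal
sketch in base R; the matrix contents and the names by_name and by_index
are made up for illustration and are not part of the session above.

```r
# Minimal sketch: sort() vs order() on a matrix with duplicated column names.
# The values 1:8 are chosen for illustration only.
a <- matrix(1:8, nrow = 2)
colnames(a) <- c("c", "b", "a", "c")

# Indexing by name: a duplicated name always matches its FIRST occurrence,
# so the fourth column (values 7, 8) is lost and the first one appears twice.
by_name <- a[, sort(colnames(a))]

# Indexing by position: order() returns integer indexes, so every column
# is kept exactly once and its name travels with it.
by_index <- a[, order(colnames(a))]

print(by_name)
print(by_index)
```

Both results carry the column names a, b, c, c, but only the one built
with order() still contains the data of the second "c" column.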

On Fri, 2007-08-03 at 09:05 -0700, Robert Gentleman wrote:
> Hi,
>    I think that there is some confusion here. First, sort does not change 
> any values; it sorts. The changing of column and/or row names is "a 
> feature" that comes with the [ operator.  Details are given on the man 
> page for [.data.frame.
> 
>    I do not believe that there is any such behavior (currently) for 
> matrices. And I could not replicate the behavior described by Carol 
> except with data.frames, which is the documented (if peculiar) behavior.
> 
> 
>   best wishes
>     Robert
> 
> m=matrix(rnorm(25), nc=5)
> colnames(m) = rep("A", 5)
> m[,2]
> m[1,]
> m[,sort(colnames(m))]
> 
> but,
>   y = data.frame(m)
>   y
> # data.frame() changes the colnames (check.names=TRUE is the default)
>   y = data.frame(m, check.names=FALSE)
> 
> # but then do change the names
>   y[1,]
>   y[,sort(colnames(y))]
> 
> 
> 
> Oleg Sklyar wrote:
> > mat[, order(colnames(mat))]
> > 
> > order provides the indexes of the sorted elements without affecting
> > their names. sort works only halfway in your case, because you are
> > sorting characters and columns can be identified by characters;
> > otherwise it is wrong.
> > 
> > Here is the working example:
> > 
> >> x<-c("col","abc","def","abc","col")
> >> a<-matrix(runif(25),ncol=5,nrow=5)
> >> colnames(a) <- x
> >> a
> >            col        abc       def        abc       col
> > [1,] 0.9815985 0.65855865 0.9982046 0.07781167 0.3228944
> > [2,] 0.5970836 0.19195563 0.9082061 0.80489513 0.9190933
> > [3,] 0.8147790 0.13499074 0.9431437 0.41154237 0.6487952
> > [4,] 0.7661668 0.26216671 0.6694043 0.66462428 0.2177653
> > [5,] 0.5604505 0.04371932 0.7873665 0.44849293 0.1700327
> >> a[,order(x)]
> >             abc        abc       col       col       def
> > [1,] 0.65855865 0.07781167 0.9815985 0.3228944 0.9982046
> > [2,] 0.19195563 0.80489513 0.5970836 0.9190933 0.9082061
> > [3,] 0.13499074 0.41154237 0.8147790 0.6487952 0.9431437
> > [4,] 0.26216671 0.66462428 0.7661668 0.2177653 0.6694043
> > [5,] 0.04371932 0.44849293 0.5604505 0.1700327 0.7873665
> > 
> >> order(x)
> > [1] 2 4 1 5 3
> > 
> > On Fri, 2007-08-03 at 05:36 -0700, carol white wrote:
> >> Hello,
> >> How can I sort a matrix by its column names while preserving identical column names?
> >>
> >> When I use mat[, sort(colnames(mat))], sort changes all column names to unique ones. For example, if the name of 2 columns is col, the 2nd will be changed to col.1, whereas I want to keep the name col for both columns.
> >>
> >> col    col -> col   col.1
> >>
> >> thanks
> >>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
-- 
Dr. Oleg Sklyar * EBI-EMBL, Cambridge CB10 1SD, England * +44-1223-494466


