[BioC] Sorting matrix by column

James W. MacDonald jmacdon at uw.edu
Tue Oct 23 17:56:53 CEST 2012


Weird. I don't see how that is possible, as it appears you have both
character and numeric values in that matrix, which is not allowed: every
element of an atomic matrix must have the same type. You are getting the
error that you would expect if x[,2] is a list:

 > x2 <- list(rnorm(100), letters)
 > x2[order(x2)]
Error in .Method(..., na.last = na.last, decreasing = decreasing) :
   unimplemented type 'list' in 'orderVector1'
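
The same thing can happen inside an object that still reports itself as a
matrix. The sketch below is only a guess at how such an object might arise
(the way it is built with cbind() on lists is an assumption, not something
taken from the original session); it shows how typeof() and is.list() expose
the problem, and how unlist() gives order() an atomic vector to work with.

```r
# Hypothetical reconstruction: cbind() on lists yields a matrix of mode
# "list", which prints character cells quoted and numeric cells bare.
ids   <- list("10371400", "10453900", "10375051")
genes <- list("Tcea1", "Lypla1", "Atp6v1h")
logfc <- list(0.1886117, 0.3592492, 0.6713107)
x <- cbind(ID = ids, "Gene Symbol" = genes, logFC = logfc)

is.matrix(x)     # TRUE: it has a dim attribute, so class(x) reports a matrix
typeof(x)        # "list": the cells are list elements, not an atomic vector
is.list(x[, 2])  # TRUE: this is the kind of input order() refuses to sort

# unlist() flattens the column into a character vector that order() accepts.
gene_order <- order(unlist(x[, 2]))
x_sorted <- x[gene_order, ]
```

If typeof(x) returns "list" for your object, that would explain why calling
base::order() explicitly made no difference.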

I would recommend starting again - your x object seems to be busted somehow.
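
As a minimal sketch of what starting again could look like, the table can be
rebuilt as a data frame, sorted by gene symbol, and reduced to one row per
symbol. The column name Symbol and the choice of rows are illustrative, and
"highest log fold change" is assumed here to mean the largest logFC value
rather than the largest magnitude.

```r
# Rebuild the table as a data frame so each column has one atomic type.
# The rows are a few taken from the listing in the original post.
x <- data.frame(
  ID     = c("10371400", "10453900", "10421227", "10534966", "10398408"),
  Symbol = c("Lypla1", "Tcea1", "Cspp1", "Cspp1", "Cspp1"),
  logFC  = c(0.3592492, 0.1886117, -0.1480766, -0.2436361, -0.4040665),
  stringsAsFactors = FALSE
)

# Sort by gene symbol, breaking ties with the largest logFC first.
x_sorted <- x[order(x$Symbol, -x$logFC), ]

# duplicated() flags every repeat after the first, so the first row per
# symbol (the one with the highest logFC) is the one that survives.
x_unique <- x_sorted[!duplicated(x_sorted$Symbol), ]
```

To keep the row with the largest magnitude instead, sort on -abs(x$logFC)
in the second argument to order().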

Best,

Jim



On 10/23/2012 11:42 AM, Kasoji, Manjula (NIH/NCI) [C] wrote:
> I get:
>
>> class(x)
> [1] "matrix"
>
>
>
> On 10/23/12 11:41AM, "James W. MacDonald"<jmacdon at uw.edu>  wrote:
>
>> What do you get from
>>
>> class(x)
>>
>> On 10/23/2012 11:38 AM, Kasoji, Manjula (NIH/NCI) [C] wrote:
>>> Hi Jim,
>>>
>>> The R session info below does correspond to the session I pasted. When I
>>> tried your suggestion, I still get an error:
>>>
>>>> x[base::order(x[,2]),]
>>> Error in base::order(x[, 2]) :
>>>     unimplemented type 'list' in 'orderVector1'
>>>
>>>
>>> I see that you don't have quotes around the ID and Gene Symbol names in
>>> your matrix. Is there a way to remove the quotes?
>>>
>>> Thanks!
>>>
>>> On 10/23/12 11:27AM, "James W. MacDonald"<jmacdon at uw.edu>   wrote:
>>>
>>>> On 10/23/2012 11:15 AM, Guest [guest] wrote:
>>>>> Hi,
>>>>>
>>>>> I would like to sort a matrix by a specific column (column 2). I tried
>>>>> the order() function, but I get an error. I think it is because the
>>>>> values in column 2 are not numeric; they are gene symbols. This may be
>>>>> a general R question, but I thought I would post it here since it is
>>>>> microarray data analysis.
>>>>>
>>>>> I have matrix x:
>>>>>
>>>>>> x
>>>>>             ID         Gene Symbol     logFC      Adj.PVal
>>>>> 10344624 "10371400" "Lypla1"        0.3592492  0.9999522
>>>>> 10344633 "10453900" "Tcea1"         0.1886117  0.9999522
>>>>> 10344637 "10375051" "Atp6v1h"       0.6713107  0.9999522
>>>>> 10344653 "10575211" "Oprk1"         -0.2342731 0.9999522
>>>>> 10344658 "10566254" "Rb1cc1"        1.790676   0.9999522
>>>>> 10344674 "10602372" "Fam150a"       1.397496   0.9999522
>>>>> 10344679 "10398428" "St18"          -0.3278807 0.9999522
>>>>> 10344707 "10383518" "Pcmtd1"        -0.2231074 0.9999522
>>>>> 10344713 "10397054" "Ahcy"          -0.1844897 0.9999522
>>>>> 10344723 "10384020" "Rrs1"          -0.2322781 0.9999522
>>>>> 10344725 "10608710" "Adhfe1"        0.5993566  0.9999522
>>>>> 10344741 "10363762" "Hnrnpa3"       -0.2660978 0.9999522
>>>>> 10344743 "10375058" "3110035E14Rik" 0.9178868  0.9999522
>>>>> 10344750 "10381603" "Sgk3"          -0.2961638 0.9999522
>>>>> 10344772 "10442373" "6030422M02Rik" -0.1653454 0.9999522
>>>>> 10344789 "10421227" "Cspp1"         -0.1480766 0.9999522
>>>>> 10344799 "10534966" "Cspp1"         -0.2436361 0.9999522
>>>>> 10344801 "10398408" "Cspp1"         -0.4040665 0.9999522
>>>>> 10344803 "10398418" "Cspp1"         -0.2556627 0.9999522
>>>>> 10344805 "10572772" "Cspp1"         -0.1864641 0.9999522
>>>>>
>>>>> I want to sort on the "Gene Symbol" column so that I can remove the
>>>>> duplicates and keep the one with the highest log fold change.
>>>>>
>>>>> I tried the following and received an error.
>>>>>> x[order(x[,2]),]
>>>>> Error in order(x[, 2]) : unimplemented type 'list' in 'orderVector1'
>>>> I am not sure the sessionInfo() you give below corresponds to the
>>>> session above. I get:
>>>>
>>>>> x <- data.frame(ID = 12345:12354,
>>>>     Gene = Rkeys(mogene10sttranscriptclusterSYMBOL)[5001:5010],
>>>>     logFC = rnorm(10), pval = runif(10))
>>>>> x
>>>>         ID   Gene       logFC      pval
>>>> 1  12345  Sepw1  0.56914952 0.4916910
>>>> 2  12346  Serf1  0.83929962 0.4816986
>>>> 3  12347 Gm4748  0.12462117 0.9372249
>>>> 4  12348   Sez6 -0.21468480 0.4921201
>>>> 5  12349  Foxp3 -1.36283694 0.4575675
>>>> 6  12350  Sfpi1  1.03632565 0.5251826
>>>> 7  12351  Sfrp1  0.04689108 0.3068112
>>>> 8  12352   Frzb  0.08379607 0.1509499
>>>> 9  12353  Sfrp4 -1.61513620 0.9336235
>>>> 10 12354  Srsf2  1.56222316 0.2571122
>>>>> x[order(x[,2]),]
>>>>         ID   Gene       logFC      pval
>>>> 5  12349  Foxp3 -1.36283694 0.4575675
>>>> 8  12352   Frzb  0.08379607 0.1509499
>>>> 3  12347 Gm4748  0.12462117 0.9372249
>>>> 1  12345  Sepw1  0.56914952 0.4916910
>>>> 2  12346  Serf1  0.83929962 0.4816986
>>>> 4  12348   Sez6 -0.21468480 0.4921201
>>>> 6  12350  Sfpi1  1.03632565 0.5251826
>>>> 7  12351  Sfrp1  0.04689108 0.3068112
>>>> 9  12353  Sfrp4 -1.61513620 0.9336235
>>>> 10 12354  Srsf2  1.56222316 0.2571122
>>>>
>>>> It appears you have something loaded that thinks you want to use the
>>>> orderVector1() function. You can always specify the function you are
>>>> intending with the :: operator (in this case, you want base::order()).
>>>>
>>>>> x[base::order(x[,2]),]
>>>>         ID   Gene       logFC      pval
>>>> 5  12349  Foxp3 -1.36283694 0.4575675
>>>> 8  12352   Frzb  0.08379607 0.1509499
>>>> 3  12347 Gm4748  0.12462117 0.9372249
>>>> 1  12345  Sepw1  0.56914952 0.4916910
>>>> 2  12346  Serf1  0.83929962 0.4816986
>>>> 4  12348   Sez6 -0.21468480 0.4921201
>>>> 6  12350  Sfpi1  1.03632565 0.5251826
>>>> 7  12351  Sfrp1  0.04689108 0.3068112
>>>> 9  12353  Sfrp4 -1.61513620 0.9336235
>>>> 10 12354  Srsf2  1.56222316 0.2571122
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>> If anyone has any suggestions for an easy way to sort a significant
>>>>> gene list, remove duplicated values, and keep the value with highest
>>>>> fold change, that would be helpful!
>>>>>
>>>>> I've posted my session info below.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Guest
>>>>>
>>>>>     -- output of sessionInfo():
>>>>>
>>>>>> sessionInfo()
>>>>> R version 2.15.1 (2012-06-22)
>>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>>
>>>>> locale:
>>>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] tools_2.15.1
>>>>>
>>>>> --
>>>>> Sent via the guest posting facility at bioconductor.org.
>>>>>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


