[BioC] Sorting matrix by column

Tue Oct 23 21:32:32 CEST 2012

If you want to annotate data, an easier way to do it is to use the 
annaffy package - you can output either text or HTML tables. I have some 
functions in affycoretools to automate going from a MArrayLM object to 
the HTML or text tables if you are interested.

Best,

Jim
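
For anyone following along, here is a minimal sketch of how the cbind()/mget() combination described in the quoted message below can produce a matrix of mode "list", which order() then refuses to sort. The `symbols` list is a hypothetical stand-in for an mget() result (it is not taken from the thread), and the unlist() step assumes exactly one symbol per probe ID.

```r
# Hypothetical stand-in for what mget() returns: a named list, one entry per ID.
symbols <- list("10344624" = "Lypla1",
                "10344633" = "Tcea1",
                "10344637" = "Atp6v1h")
logFC <- c(0.3592492, 0.1886117, 0.6713107)

# Binding a list to an atomic vector gives a matrix whose cells are list elements.
x <- cbind("Gene Symbol" = symbols, logFC = logFC)

# order() cannot sort a list column, so this call fails; capture the error.
err <- tryCatch(order(x[, "Gene Symbol"]), error = function(e) e)

# Flatten each column into an atomic vector and build a data.frame instead.
# Assumption: every ID maps to exactly one symbol, so unlist() keeps the length.
df <- data.frame(Symbol = unlist(x[, "Gene Symbol"]),
                 logFC  = unlist(x[, "logFC"]),
                 stringsAsFactors = FALSE)

# Sorting on the character column now works.
df_sorted <- df[order(df$Symbol), ]
```

If some IDs map to several symbols or to NA, unlist() would change the length, so those entries would need to be collapsed or filtered first.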

On 10/23/2012 2:32 PM, Kasoji, Manjula (NIH/NCI) [C] wrote:
> Thanks, guys. I think I got that because I did a cbind() with my ebayes()
> results and my annotation results from mget(), which I used to annotate my
> significant genes from the mogene10sttranscriptcluster.db package.
>
> I'll try out a few things. If you guys have any further suggestions or
> recommendations I will certainly appreciate them.
>
> Thanks!
>
> On 10/23/12 11:57AM, "Axel Klenk"<axel.klenk at actelion.com>  wrote:
>
>> Dear Manjula,
>>
>> wow. How did you create that? :-)
>>
>> order() doesn't like lists:
>>
>>> order(list(1:3))
>> Error in order(list(1:3)) : unimplemented type 'list' in 'orderVector1'
>>
>> and I think you should try to make your x look something like the
>> data.frame that Jim has used in his example and it will work.
>>
>> Cheers,
>>
>> Axel (not Alex!!) Klenk
>> Research Informatician
>> Information Management Drug Discovery
>>
>> Actelion Pharmaceuticals Ltd. • Gewerbestrasse 16 • CH-4123 Allschwil
>> • Switzerland
>> G12.O1.R10
>>
>> axel.klenk at actelion.com • www.actelion.com
>> Address for visitors: Hegenheimermattweg 92
>>
>>
>> On Tue, Oct 23, 2012 at 5:45 PM, Kasoji, Manjula (NIH/NCI) [C]
>> <manjula.kasoji at nih.gov>  wrote:
>>> Hi Alex,
>>>
>>> Please see the output below:
>>>
>>>> str(x)
>>>
>>> List of 80
>>>   $ : chr "10371400"
>>>   $ : chr "10453900"
>>>   $ : chr "10375051"
>>>   $ : chr "10575211"
>>>   $ : chr "10566254"
>>>   $ : chr "10602372"
>>>   $ : chr "10398428"
>>>   $ : chr "10383518"
>>>   $ : chr "10397054"
>>>   $ : chr "10384020"
>>>   $ : chr "10608710"
>>>   $ : chr "10363762"
>>>   $ : chr "10375058"
>>>   $ : chr "10381603"
>>>   $ : chr "10442373"
>>>   $ : chr "10421227"
>>>   $ : chr "10534966"
>>>   $ : chr "10398408"
>>>   $ : chr "10398418"
>>>   $ : chr "10572772"
>>>   $ : chr "Lypla1"
>>>   $ : chr "Tcea1"
>>>   $ : chr "Atp6v1h"
>>>   $ : chr "Oprk1"
>>>
>>>> class(x[,2])
>>> [1] "list"
>>>
>>>
>>>
>>>
>>> On 10/23/12 11:42AM, "Axel Klenk"<axel.klenk at actelion.com>  wrote:
>>>
>>>> Dear Guest,
>>>>
>>>> I think your approach is valid in general and it is your x that is
>>>> causing the problem; column 'Gene Symbol' appears to contain two
>>>> values. What is the result of
>>>>
>>>> str(x)
>>>>
>>>> and/or
>>>>
>>>> class(x[,2])
>>>>
>>>> ?
>>>>
>>>> Cheers,
>>>>
>>>> - axel
>>>>
>>>>
>>>> Axel Klenk
>>>> Research Informatician
>>>> Information Management Drug Discovery
>>>>
>>>> Actelion Pharmaceuticals Ltd. • Gewerbestrasse 16 • CH-4123 Allschwil
>>>> • Switzerland
>>>> G12.O1.R10
>>>>
>>>> axel.klenk at actelion.com • www.actelion.com
>>>> Address for visitors: Hegenheimermattweg 92
>>>>
>>>>
>>>>
>>>> On Tue, Oct 23, 2012 at 5:15 PM, Guest [guest]<guest at bioconductor.org>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I would like to sort a matrix by a specific column (column 2). I tried
>>>>> the order() function, but I get an error. I think it is because the
>>>>> values in column 2 are not numeric; they are gene symbols. This may be a
>>>>> general R question, but I thought I would post it here since it is
>>>>> microarray data analysis.
>>>>>
>>>>> I have matrix x:
>>>>>
>>>>>> x
>>>>>           ID         Gene Symbol     logFC      Adj.PVal
>>>>> 10344624 "10371400" "Lypla1"        0.3592492  0.9999522
>>>>> 10344633 "10453900" "Tcea1"         0.1886117  0.9999522
>>>>> 10344637 "10375051" "Atp6v1h"       0.6713107  0.9999522
>>>>> 10344653 "10575211" "Oprk1"         -0.2342731 0.9999522
>>>>> 10344658 "10566254" "Rb1cc1"        1.790676   0.9999522
>>>>> 10344674 "10602372" "Fam150a"       1.397496   0.9999522
>>>>> 10344679 "10398428" "St18"          -0.3278807 0.9999522
>>>>> 10344707 "10383518" "Pcmtd1"        -0.2231074 0.9999522
>>>>> 10344713 "10397054" "Ahcy"          -0.1844897 0.9999522
>>>>> 10344723 "10384020" "Rrs1"          -0.2322781 0.9999522
>>>>> 10344725 "10608710" "Adhfe1"        0.5993566  0.9999522
>>>>> 10344741 "10363762" "Hnrnpa3"       -0.2660978 0.9999522
>>>>> 10344743 "10375058" "3110035E14Rik" 0.9178868  0.9999522
>>>>> 10344750 "10381603" "Sgk3"          -0.2961638 0.9999522
>>>>> 10344772 "10442373" "6030422M02Rik" -0.1653454 0.9999522
>>>>> 10344789 "10421227" "Cspp1"         -0.1480766 0.9999522
>>>>> 10344799 "10534966" "Cspp1"         -0.2436361 0.9999522
>>>>> 10344801 "10398408" "Cspp1"         -0.4040665 0.9999522
>>>>> 10344803 "10398418" "Cspp1"         -0.2556627 0.9999522
>>>>> 10344805 "10572772" "Cspp1"         -0.1864641 0.9999522
>>>>>
>>>>> I want to sort on the "Gene Symbol" column so that I can remove the
>>>>> duplicates and keep the one with the highest log fold change.
>>>>>
>>>>> I tried the following and received an error.
>>>>>> x[order(x[,2]),]
>>>>> Error in order(x[, 2]) : unimplemented type 'list' in 'orderVector1'
>>>>>
>>>>> If anyone has any suggestions for an easy way to sort a significant
>>>>> gene list, remove duplicated values, and keep the value with highest
>>>>> fold change, that would be helpful!
>>>>>
>>>>> I've posted my session info below.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Guest
>>>>>
>>>>>   -- output of sessionInfo():
>>>>>
>>>>>> sessionInfo()
>>>>> R version 2.15.1 (2012-06-22)
>>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>>
>>>>> locale:
>>>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] tools_2.15.1
>>>>>
>>>>> --
>>>>> Sent via the guest posting facility at bioconductor.org.
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> --
>>>>
>>>> The information of this email and in any file transmitted with it is
>>>> strictly confidential and may be legally privileged.
>>>> It is intended solely for the addressee. If you are not the intended
>>>> recipient, any copying, distribution or any other use of this email is
>>>> prohibited and may be unlawful. In such case, you should please notify
>>>> the
>>>> sender immediately and destroy this email.
>>>> The content of this email is not legally binding unless confirmed by
>>>> letter.
>>>> Any views expressed in this message are those of the individual sender,
>>>> except where the message states otherwise and the sender is authorised
>>>> to
>>>> state them to be the views of the sender's company. For further
>>>> information
>>>> about Actelion please see our website at http://www.actelion.com
>>>>
>>
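
The original question in the quoted thread above (sort by gene symbol, drop duplicates, keep the row with the highest fold change) can be sketched in base R once the data is an ordinary data.frame. The toy data below is modelled on a few rows of the posted table. Treating "highest" as largest absolute logFC is an assumption; drop abs() to rank by the signed value instead.

```r
# Toy data modelled on a few rows of the table in the quoted post.
df <- data.frame(
  ID     = c("10371400", "10453900", "10421227", "10534966", "10398408"),
  Symbol = c("Lypla1", "Tcea1", "Cspp1", "Cspp1", "Cspp1"),
  logFC  = c(0.3592492, 0.1886117, -0.1480766, -0.2436361, -0.4040665),
  stringsAsFactors = FALSE
)

# Sort by symbol, then by decreasing absolute log fold change within a symbol.
sorted <- df[order(df$Symbol, -abs(df$logFC)), ]

# duplicated() flags every occurrence after the first, so only the first row
# per symbol (the one with the largest |logFC|) is kept.
dedup <- sorted[!duplicated(sorted$Symbol), ]
```

Sorting first and filtering with duplicated() afterwards keeps the whole row (ID, p-value columns, and so on) for the retained probe, which a plain aggregate on logFC would not.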

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099