[BioC] Extracting expressions after a multtest procedure (part2)

J. Miguel Marin jmmarin at est-econ.uc3m.es
Wed Mar 19 18:52:59 CET 2008


Hello,

I think this is valuable information.

Best

----- Forwarded message from jmmarin at est-econ.uc3m.es -----
    Date: Mon, 17 Mar 2008 23:59:59 +0100
    From: "J. Miguel Marin" <jmmarin at est-econ.uc3m.es>
Reply-To: "J. Miguel Marin" <jmmarin at est-econ.uc3m.es>
 Subject: Re: [BioC] Extracting expressions after a multtest procedure (part2)
      To: Martin Morgan <mtmorgan at fhcrc.org>

Hello Martin,

You are absolutely right.

In fact, I need to identify exactly the right genes, because I am
looking for significant differences in expression in order to
calculate distances between the same genes in the "BCR/ABL" group and
in the "NEG" group.

Thank you very much again for your help.

Best.

> JUAN MIGUEL MARIN DIAZARAQUE wrote:
>> Hello Martin,
>>
>> In fact when I wrote
>>
>>> idx1 = res$index[res$adjp[,"BH"]<0.05,]
>> Error in res$index[res$adjp[, "BH"] < 0.05, ] :  incorrect number of 
>> dimensions
>
> I guess that should have been idx1 = res$index[ res$adjp[,"BH"]<0.05 ]
>
> I know that Sean's suggestion works, in the sense that you get an 
> answer. I was not quite sure that it was the right answer -- res$adjp 
> is ordered from highest significance to lowest significance. When you
>
> idx2 = res$adjp[,"BH"]<0.05
>
> you get a logical vector (TRUE and FALSE values), probably all the 
> first ones are TRUE and all the later ones FALSE (because adjp was 
> ordered that way).
>
> > head(idx2, n=110)
>   [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [37]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [61]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [73]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [85]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>  [97]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
> [109] FALSE FALSE
>
> Now when you subset exprs(esetSub)[idx2,], you select all the rows of 
> exprs(esetSub) for which the corresponding entry in idx2 is 'TRUE'. 
> But since the order of genes in res$adjp is different from the order 
> of genes in exprs(esetSub), you won't get the right genes.
>
> Martin
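
The ordering problem Martin describes can be reproduced on toy data. The
sketch below is only an illustration in base R: `expr`, `rawp`, and the
hand-built `res` list are hypothetical stand-ins for exprs(esetSub) and for
the result of mt.rawp2adjp(). It assumes, as the multtest documentation
describes, that res$adjp is sorted by increasing raw p-value and that
res$index holds the original row positions.

```r
# Toy stand-in for exprs(esetSub): 6 genes x 4 samples (hypothetical data).
expr <- matrix(seq_len(24), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))

# Hypothetical raw p-values, in the same row order as expr.
rawp <- c(0.20, 0.001, 0.50, 0.002, 0.90, 0.03)

# Hand-built imitation of the list returned by mt.rawp2adjp() (assumption):
# adjp is sorted by increasing raw p-value, index records the original rows.
ord <- order(rawp)
res <- list(adjp  = cbind(rawp = rawp[ord], BH = p.adjust(rawp, "BH")[ord]),
            index = ord)

# Logical vector in SORTED order: TRUE values come first.
sig <- res$adjp[, "BH"] < 0.05

# Naive subsetting mixes the sorted order with the original row order.
naive <- rownames(expr)[sig]

# Going through res$index maps the sorted positions back to original rows.
correct <- rownames(expr)[res$index[sig]]

# Equivalent: put the adjusted p-values back in original order first.
keep <- res$adjp[order(res$index), "BH"] < 0.05
sig_expr <- expr[keep, , drop = FALSE]

print(naive)    # "gene1" "gene2"  (first two rows, not the significant genes)
print(correct)  # "gene2" "gene4"
```

With real data the same two lines would apply to exprs(esetSub) in place of
`expr`, provided esetSub is the object the raw p-values were computed from.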
>
>> Anyway, thank you very much for your help.
>>
>> Cheers
>>
>>> Off list, because I am not sure, but from the docs I think there 
>>> might be a couple of redirections
>>>
>>> idx1 = res$index[res$adjp[,"BH"]<0.05,]
>>>
>>> to get to resT[idx,], and maybe a second (or using the rownames) to 
>>> get back to the ExpressionSet.
>>>
>>> If you figure it out, Juan, and if there is a correction, it would 
>>> be good to post to the list again.
>>>
>>> Martin
>>>
>>> Sean Davis wrote:
>>>> On Sun, Mar 16, 2008 at 11:56 AM, JUAN MIGUEL MARIN DIAZARAQUE
>>>> <jmmarin at est-econ.uc3m.es> wrote:
>>>>> Hello,
>>>>>
>>>>>  In a previous message I asked how to extract just the expressions of 102
>>>>>  genes under two conditions after a multtest procedure.
>>>>>
>>>>>  In the end, I wrote this inelegant code that seems to work:
>>>>>
>>>>>  quedan <- resT[res$adjp[,"BH"]<0.05,]
>>>>>  busco <- dimnames(quedan)[[1]]
>>>>>
>>>>>  cosa <- NULL
>>>>>  for (i in 1:length(dimnames(exprs(esetSub))[[1]])) {
>>>>>         for (j in 1:length(busco)) {
>>>>>                 if (dimnames(exprs(esetSub[i,]))[[1]] == busco[j]) {
>>>>>                         cosa <- rbind(cosa, as.data.frame(exprs(esetSub))[i,])
>>>>>                 }
>>>>>         }
>>>>>  }
>>>>
>>>> Generally, you can subset ExpressionSets just like you do data frames;
>>>> rows are genes or probes and columns are samples.  If esetSub is what
>>>> you used for your multtest procedure, you should be able to do:
>>>>
>>>> exprs(esetSub)[res$adjp[,'BH']<0.05,]
>>>>
>>>> In general, for loops are quite useful in R, but for things like
>>>> subsetting or vectorized operations, it is better to look for other
>>>> alternatives as they are likely to be faster.
>>>>
>>>> Sean
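
As a small illustration of Sean's point about avoiding loops for subsetting,
the double loop over gene names can be written as a single indexing step.
This is a sketch on toy data, not the code used in the thread: `expr` and the
contents of `busco` are hypothetical stand-ins for exprs(esetSub) and the
gene names of interest.

```r
# Toy stand-in for exprs(esetSub) (hypothetical data).
expr <- matrix(seq_len(24), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))

# Hypothetical vector of gene names to keep, playing the role of 'busco'.
busco <- c("gene4", "gene2")

# Vectorized replacement for the double loop: keep the rows whose name is
# in busco, preserving the row order of expr (as the loop does).
cosa <- expr[rownames(expr) %in% busco, , drop = FALSE]

# Alternative: index by name, which returns rows in the order given by busco.
cosa_by_name <- expr[busco, , drop = FALSE]

print(rownames(cosa))          # "gene2" "gene4"
print(rownames(cosa_by_name))  # "gene4" "gene2"
```

Unlike the loop, which builds a data frame with rbind, both forms return a
matrix; as.data.frame() can be applied to the result if a data frame is
needed.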
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: 
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>
>>
>>
>>
>> jm~
>>
>> _______________________________
>>
>>     Juan Miguel Marin
>>
>> http://www.est.uc3m.es/jmmarin
>>
>>    Dep. of Statistics
>> University Carlos III of Madrid
>>        Spain (E.U.)
>> _______________________________
>>
>>
>
>
> -- 
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M2 B169
> Phone: (206) 667-2793
>

jm~

_______________________________

        J. Miguel Marin

http://www.est.uc3m.es/jmmarin

    Dep. of Statistics
University Carlos III of Madrid
        Spain (E.U.)
_______________________________



----- End of forwarded message -----


