[R] Select single probe-set with median expression from multiple probe-sets corresponding to same gene -AFFY

Martin Morgan mtmorgan at fhcrc.org
Thu Apr 4 05:47:34 CEST 2013


On 04/03/2013 08:34 PM, Martin Morgan wrote:
> On 04/03/2013 03:17 PM, Atul Kakrana wrote:
>> Hello All,
>>
>> I need your help. I am analysing affymetrix data and have to select the
>> probe-set that has median expression among all the probe-sets for same
>> gene. This way I want to remove the redundancy by keeping the analysis
>> to single gene entry level. I am fully aware that it is not a nice thing
>> to do but I just have to do it.
>>
>> To do so, I came across 'findLargest' function of 'genefilter' package
>> but it's not well documented; and I do not know how to implement the
>> 'findLargest' function. At this point I have:
>> esetRMA <- rma(mydata)
>>
>> Could anybody guide me on how can I select single probeset with median
>> expression from multiple probe-sets corresponding to single gene and
>> discard others? Is there any other way to achieve so i.e. other than
>> using 'genefilter'?
>>
>> Genefilter package:
>> http://www.bioconductor.org/packages/2.11/bioc/html/genefilter.html
>
> Hi Atul -- It's a Bioconductor package, so you might as well ask on the
> Bioconductor mailing list instead
>
>    http://bioconductor.org/help/mailing-list/
>
> As a reproducible example, load the "ALL" sample ExpressionSet and the Biobase
> and genefilter packages
>
>    library(Biobase)
>    library(ALL)
>    library(genefilter)
>
> The three arguments to findLargest are the names of the probe sets
>
>    featureNames(ALL)
>
> the test statistic
>
>    rowMedians(ALL)
>
> and the chip on which the ExpressionSet is based
>
>    annotation(ALL)
>
> So the variable
>
>    idx = findLargest(featureNames(ALL), rowMedians(ALL), annotation(ALL))
>
> identifies the probes and
>
>    ALL1 = ALL[idx,]
>
> gets you the data you're interested in.
>
> Again, follow-up questions should go to the Bioconductor mailing list.

Oops, I was a little quick on the draw there. That gives the probe set with the
largest median expression across samples; I'm not really sure what you're after
-- the 'closest-to-median' probe set when averaging expression across samples?
At any rate you'll get a more considered response on the Bioc mailing list;
sorry for being misleading. Martin
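
For what it's worth, one possible reading of 'closest-to-median' can be sketched
in base R, without genefilter. This is only an illustration under stated
assumptions: the matrix `expr`, the probe-to-gene vector `gene`, and the helper
`pickMedianProbe` are made-up names and toy values, not part of any package. With
real data one would presumably start from exprs(esetRMA) and a probe-set-to-gene
mapping taken from the chip's annotation package.

```r
# Toy data (hypothetical): 6 probe sets (rows) x 4 samples (columns).
expr <- matrix(
  c(5, 6,  4, 5,   # p1, geneA
    7, 8,  6, 7,   # p2, geneA
    9, 10, 8, 9,   # p3, geneA
    3, 4,  2, 3,   # p4, geneB
    6, 7,  5, 6,   # p5, geneB
    8, 9,  7, 8),  # p6, geneC
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("p", 1:6), paste0("s", 1:4))
)

# Hypothetical probe-set-to-gene mapping, named by probe set.
gene <- c(p1 = "geneA", p2 = "geneA", p3 = "geneA",
          p4 = "geneB", p5 = "geneB", p6 = "geneC")

# Per-probe-set summary: median expression across samples.
stat <- apply(expr, 1, median)

# For each gene, keep the probe set whose summary is closest to the
# median of the summaries of all probe sets mapped to that gene.
# Ties (e.g. a gene with two probe sets) go to the first probe set.
pickMedianProbe <- function(stat, gene) {
  byGene <- split(names(stat), gene[names(stat)])
  vapply(byGene, function(ids) {
    s <- stat[ids]
    ids[which.min(abs(s - median(s)))]
  }, character(1))
}

keep <- pickMedianProbe(stat, gene)

# One row per gene.
exprOne <- expr[keep, , drop = FALSE]
```

Here `keep` is a character vector of probe-set names, named by gene. A gene with
an even number of probe sets has no single 'median' probe set, so some
tie-breaking rule is unavoidable; taking the first is an arbitrary choice made
for this sketch.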

>
> Martin
>
>
>>
>> Thanks
>>
>> AK
>>
>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


