[BioC] Affy: probeset to gene expression with expresso

James W. MacDonald jmacdon at uw.edu
Tue Aug 13 19:47:45 CEST 2013


I don't think there is an answer to these questions. Well, I think there 
might be several hundred or maybe thousands of answers (e.g., for each 
gene that is measured more than once there might be something reasonable 
to do, based on what the duplicates are measuring), but we can only do 
things in aggregate, and I don't think there is a simple solution that 
can be applied en masse to all duplicated transcripts without making 
pretty strong assumptions.

Because of this, I tend to default to the status quo and just report 
probeset level data because I don't have any idea what the 'right' thing 
to do is.
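
That said, for anyone who wants to try collapsing anyway, a minimal 
sketch in base R might look like the following. The probeset IDs, 
Entrez IDs and expression values are made up for illustration. In 
practice the mapping would typically come from the chip's annotation 
package (hgu95av2.db for the Dilution data, as far as I know). Averaging 
is only one of many possible choices, and it carries exactly the strong 
assumptions mentioned above.

```r
# Toy probeset-level matrix (hypothetical IDs and values, not real data)
expr <- matrix(c(8.1, 8.3,
                 7.9, 8.0,
                 5.2, 5.1,
                 6.0, 6.4),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("ps1_at", "ps2_at", "ps3_at", "ps4_at"),
                               c("sampleA", "sampleB")))

# Hypothetical probeset -> Entrez gene mapping; ps4_at has no annotation
probe2gene <- c(ps1_at = "1001", ps2_at = "1001",
                ps3_at = "2002", ps4_at = NA)

# Drop probesets that do not map to any gene
genes <- probe2gene[rownames(expr)]
keep  <- !is.na(genes)
genes <- genes[keep]

# Sum the rows per gene, then divide by the number of probesets per gene
sums      <- rowsum(expr[keep, , drop = FALSE], group = genes)
counts    <- as.vector(table(genes)[rownames(sums)])
gene_expr <- sums / counts

print(gene_expr)
```

Swapping the mean for a median, or keeping only the most variable 
probeset per gene, would be equally easy and equally arbitrary.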

Best,

Jim



On 8/13/2013 12:03 PM, Martin Preusse wrote:
> I am trying to figure out the same. There are ENDLESS publications dealing with exactly this topic.
>
> Obviously, different probes bind to different parts of the transcript. So they might represent different transcripts of the same gene or genomic locus.
>
> Maybe a mapping to transcript instead of gene is more useful. Another issue is that not all probes bind to the transcript with the same affinity. Some probes might even be pure noise. So if you average all of them the noise could cancel the signal from the more useful probes.
>
> I am trying to dig deeper into this, but there is too much stuff published … does anyone have tips for good papers/reviews? Or maybe good books that help with getting into microarray analysis?
>
> Martin
>
>
> On Tuesday, 13 August 2013 at 17:49, Helen Smith wrote:
>
>> Hi,
>>
>> Thank you Jim.
>>
>> Can I ask, I have always averaged the expressions and then completed pathway analysis for the genes rather than the probes. Do you consider it better to leave it as individual probes and assess individual expression at the pathway level?
>> I'm torn as to which is the best approach,
>>
>> Thanks,
>> Helen
>>
>> -----Original Message-----
>> From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces at r-project.org] On Behalf Of James W. MacDonald
>> Sent: 13 August 2013 16:28
>> To: Martin Preusse
>> Cc: bioconductor at r-project.org (mailto:bioconductor at r-project.org)
>> Subject: Re: [BioC] Affy: probeset to gene expression with expresso
>>
>> Hi Martin,
>>
>> I just answered a very closely related question. See if this helps:
>>
>> https://stat.ethz.ch/pipermail/bioconductor/2013-August/054353.html
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> On 8/13/2013 9:47 AM, Martin Preusse wrote:
>>> I am trying to get the gene level expression values from an Affy micro array, i.e. merge the values for probe sets representing the same gene.
>>>
>>> I tried to use the 'expresso' function from the affy package, but I always end up with an ExpressionSet containing probe sets, not genes.
>>>
>>> What is an easy way to summarize/merge probe sets to (entrez) genes?
>>>
>>>
>>> library(affydata)
>>> library(affy)
>>>
>>> # get the 'Dilution' affy batch
>>> data(Dilution)
>>>
>>> eset <- expresso(Dilution, bgcorrect.method='rma',
>>>                  normalize.method='constant', pmcorrect.method='pmonly',
>>>                  summary.method='avgdiff')
>>>
>>>
>>> write.exprs(eset,'testfile.txt')
>>>
>>>
>>> P.S.: I know it might not be the best idea to average probe sets, but
>>> I would like to try ;)
>>>
>>> Cheers
>>> Martin
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org (mailto:Bioconductor at r-project.org)
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


