[BioC] Inconsistency in RMA results from 'affy' and results from 'oligo'

Tue Aug 11 04:45:27 CEST 2009

On Mon, Aug 10, 2009 at 9:22 PM, Sean Davis<seandavi at gmail.com> wrote:
>
>
> On Mon, Aug 10, 2009 at 9:30 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
>>
>> Hi Benilton,
>>
>> If neither script generates an error, does that mean that both of
>> them are correct? But the RMA results are very different, so must
>> one of them be wrong?
>
> The probesets for the Gene ST arrays are designed roughly against exons and
> as Benilton pointed out, the current behavior of oligo is to summarize the
> probesets.  Given that there are >200k exons in the human genome, one should
> expect >200k rows in the rma-normalized data from oligo.  Affy, on the other
> hand, uses the unofficial cdf that you downloaded and summarizes to
> transcripts.  So, both are correct, but you are trying to compare apples to
> oranges.

I'm still confused about why a probeset does not correspond to a
transcript. Could you give me a formal definition of a probeset?

The book Bioinformatics and Computational Biology Solutions Using R
and Bioconductor says "Affymetrix arrays typically use between 11 and
20 probe pairs, referred to as a probeset, for each gene." But I don't
see a clear definition of probeset for all types of arrays listed on
http://bioconductor.org/docs/workflows/oligoarrays/

Another question: since 'affy' can also process Affymetrix Gene ST
Arrays, why is only 'oligo', and not 'affy', listed on the above
workflow page?

Regards,
Peng