[BioC] Inconsistency in RMA results from 'affy' and results from 'oligo'

Peng Yu pengyu.ut at gmail.com
Tue Aug 11 05:24:00 CEST 2009


On Mon, Aug 10, 2009 at 9:53 PM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
> Hi,
>
> On Aug 10, 2009, at 9:30 PM, Peng Yu wrote:
>
>> Hi Benilton,
>>
>> If neither script generates errors, does that mean that both of them
>> are correct? But the RMA results are very different, so one of them
>> must be wrong?
>
> Benilton, I think, has provided some hints which you can chew on for a bit.
> Sometimes it helps to digest some of the information and try to *really*
> figure out what it *means* instead of hoping to jump straight to an answer.
> Let's see if we can't do that with the information we have here:
>
>> On Mon, Aug 10, 2009 at 5:25 PM, Benilton Carvalho <bcarvalh at jhsph.edu> wrote:
>>>
>>> Dear Peng,
>>>
>>> I can speak for oligo and the annotation package used by it.
>>>
>>> The current release of oligo summarizes to the probeset level. The next
>>> release of oligo and annotation packages will allow you to summarize to
>>> the gene level. In your particular case, the count you'll get is roughly
>>> 35K.
>
> He mentions summaries at "the probeset level," and then "the gene level." He
> also mentions that the upcoming release of oligo WILL allow you to summarize
> to THE GENE LEVEL, which I guess suggests that currently oligo is providing
> summaries at the probeset level. Once you can get GENE LEVEL summaries, you
> will get ROUGHLY 35K VALUES.
>
> Now, you say your files are very different, and that one (or both!) of these
> packages must be wrong, since you've looked at this:
>
> $ wc gene_expr_affy.txt gene_expr_oligo.txt
> 34761   312848  5002519 gene_expr_affy.txt
> 234591  2111318 33763075 gene_expr_oligo.txt
> 269352  2424166 38765594 total
>
> It looks like your current oligo summary has many more numbers than your
> expr_affy file, which only has ROUGHLY 35K VALUES (oh wait, I think I've
> seen this number before).
>
> Now the question is: *what* are these two files telling you?
> Have you looked at their outputs, aside from the number of lines in each
> file?
> Do you see probes by the same names in each file?

Yes, I looked at these two files. I randomly picked a "Transcript ID"
(for example, 10571312) from the output that a third-party program
produced for the same set of CEL files. I found that number in
'gene_expr_affy.txt' but not in 'gene_expr_oligo.txt'. All the IDs
in both files are purely numeric (no letters or underscores).
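
Checking a single ID by hand only goes so far. A more systematic check would compare the full sets of IDs in the two files. The following is only a minimal sketch: it assumes both files are tab-delimited, have one header row, and keep the ID in the first column. None of this is confirmed above, and the function names `read_ids` and `compare_ids` are made up for illustration.

```python
import csv


def read_ids(path, delimiter="\t", has_header=True):
    """Collect the IDs found in the first column of a delimited text file.

    Assumption: the ID is in the first column and the file has one header row.
    """
    ids = set()
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        if has_header:
            next(reader, None)  # skip the header row
        for row in reader:
            if row and row[0].strip():
                ids.add(row[0].strip())
    return ids


def compare_ids(ids_a, ids_b):
    """Count the IDs unique to each set and the IDs they share."""
    return {
        "only_a": len(ids_a - ids_b),
        "only_b": len(ids_b - ids_a),
        "shared": len(ids_a & ids_b),
    }
```

Calling `compare_ids(read_ids("gene_expr_affy.txt"), read_ids("gene_expr_oligo.txt"))` would then show how many IDs the two outputs have in common. If the shared count is close to zero, the two files are probably indexed by different kinds of IDs (for example, probeset versus transcript cluster) rather than one of them being wrong.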

> What is a probe set? How is it different from a gene? Since you have many
> more probeset level numbers than gene/transcript numbers, does that mean
> there are more than 1 probeset to a gene? Is that what this is saying here?
> http://www.affymetrix.com/support/help/faqs/mouse_430/faq_8.jsp

The above webpage says "Probes in a gene family probe set (_a set) all
cross-hybridize to the same set of sequences that belong to the same
gene family (i.e. having same name in the "geneCluster" column)." But I
don't understand what the word "sequence" means in this context. Does
it mean a transcript?

> Or this figure (and the rest of the publication), here?:
> http://www.biomedcentral.com/1471-2105/8/108/figure/F1
>
> Has someone asked something like this on this mailing list before?
> https://stat.ethz.ch/pipermail/bioconductor/2009-January/025791.html

This thread does not provide very useful information.

> I actually have no idea what your output files look like (I haven't used
> oligo before, actually), but I'm just trying to help put some pieces
> together.

Thank you for collecting the links.

> In the meantime, I see that Sean has provided you a direct answer, so kudos,
> but I'd just suggest taking a bit more of this work on yourself. I think it
> will help to make more sense of the answers you will inevitably get on this
> list.

What does "kudos" mean?

Regards,
Peng


