[BioC] Inconsistency in RMA results from 'affy' and results from 'oligo'

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Aug 11 04:53:19 CEST 2009


Hi,

On Aug 10, 2009, at 9:30 PM, Peng Yu wrote:

> Hi Benilton,
>
> If both of scripts do not generate errors, does it mean that both of
> them are correct? But the RMA results are very different, so one of
> them must be wrong?

Benilton, I think, has provided some hints which you can chew on for a  
bit. Sometimes it helps to digest some of the information and try to  
*really* figure out what it *means* instead of hoping to jump straight  
to an answer. Let's see if we can't do that with the information we  
have here:

> On Mon, Aug 10, 2009 at 5:25 PM, Benilton Carvalho <bcarvalh at jhsph.edu>
> wrote:
>>
>> Dear Peng,
>>
>> I can speak for oligo and the annotation package used by it.
>>
>> The current release of oligo summarizes to the probeset level. The next
>> release of oligo and annotation packages will allow you to summarize to
>> the gene level. In your particular case, the count you'll get is
>> roughly 35K.

He mentions summaries at "the probeset level," and then "the gene  
level." He also mentions that the upcoming release of oligo WILL allow  
you to summarize to THE GENE LEVEL, which I guess suggests that  
currently oligo is providing summaries at the probeset level. Once you  
can get GENE LEVEL summaries, you will get ROUGHLY 35K VALUES.

Now, you say your files are very different, and that one (or both!) of
these packages must be wrong, since you've looked at this:

$ wc gene_expr_affy.txt gene_expr_oligo.txt
  34761   312848  5002519 gene_expr_affy.txt
 234591  2111318 33763075 gene_expr_oligo.txt
 269352  2424166 38765594 total

It looks like your current oligo summary has many more lines (roughly
235K) than your gene_expr_affy.txt file, which only has ROUGHLY 35K
LINES (oh wait, I think I've seen this number before).
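
To put a number on "many more", the two line counts from the wc output
above can simply be divided. This is only arithmetic on the reported
counts (a quick Python sketch; the variable names are mine, and it says
nothing yet about *why* the counts differ):

```python
# Line counts reported by `wc` for the two output files.
affy_lines = 34761     # gene_expr_affy.txt
oligo_lines = 234591   # gene_expr_oligo.txt

# Average number of oligo lines per affy line.
ratio = oligo_lines / affy_lines
print(f"oligo file has about {ratio:.2f} times as many lines as the affy file")
```

A ratio well above 1 is what you would expect if one file has one row
per probeset and the other has one row per gene, but that is something
to confirm by looking inside the files.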

Now the question is: *what* are these two files telling you?
Have you looked at their outputs, aside from the number of lines in  
each file?
Do you see probes by the same names in each file?
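
One way to answer the last question is to compare the row identifiers
of the two files directly. The sketch below is only an illustration: it
ASSUMES both files are tab-delimited, have a header line, and carry the
probe/probeset identifier in the first column. I don't know that your
files actually look like this, so adjust `delimiter` and `has_header`
as needed. The helper names `read_ids` and `compare_ids` are made up
for this example.

```python
import csv
import os


def read_ids(path, delimiter="\t", has_header=True):
    """Return the set of first-column values of a delimited text file.

    Assumption: the identifier lives in the first column.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        if has_header:
            next(reader, None)  # skip the header row, if there is one
        return {row[0] for row in reader if row}


def compare_ids(path_a, path_b, **kwargs):
    """Count identifiers unique to each file and shared by both."""
    ids_a = read_ids(path_a, **kwargs)
    ids_b = read_ids(path_b, **kwargs)
    return {
        "only_in_a": len(ids_a - ids_b),
        "only_in_b": len(ids_b - ids_a),
        "shared": len(ids_a & ids_b),
    }


if __name__ == "__main__":
    files = ("gene_expr_affy.txt", "gene_expr_oligo.txt")
    # Only run the comparison when both files are actually present.
    if all(os.path.exists(name) for name in files):
        print(compare_ids(*files))
```

If "shared" comes back as zero, the two files are probably not indexed
by the same kind of identifier, which would itself be a useful hint.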

What is a probeset? How is it different from a gene? Since you have
many more probeset-level numbers than gene/transcript-level numbers,
does that mean there is more than one probeset per gene? Is that what
this is saying here?
http://www.affymetrix.com/support/help/faqs/mouse_430/faq_8.jsp

Or this figure (and the rest of the publication), here?:
http://www.biomedcentral.com/1471-2105/8/108/figure/F1

Has someone asked something like this on this mailing list before?
https://stat.ethz.ch/pipermail/bioconductor/2009-January/025791.html

I actually have no idea what your output files look like (I haven't
used oligo before), but I'm just trying to help put some pieces
together.

In the meantime, I see that Sean has provided you a direct answer, so  
kudos, but I'd just suggest taking a bit more of this work on  
yourself. I think it will help to make more sense of the answers you  
will inevitably get on this list.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  | Memorial Sloan-Kettering Cancer Center
  | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


