[BioC] RMA in Bioconductor versus APT - missing probesets

Vincent Carey stvjc at channing.harvard.edu
Thu Mar 3 16:40:35 CET 2011


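Consider the oligo stack.  On the same three CEL files, oligo::rma()
summarizes 33297 probesets, whereas affy::rma() summarizes only 32321: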

> dir()[c(63,65,67)]
[1] "TisMix_mix8_03_v1_WTGene1.CEL" "TisMix_mix9_01_v1_WTGene1.CEL"
[3] "TisMix_mix9_02_v1_WTGene1.CEL"
> xo = oligo::read.celfiles(filenames=dir()[c(63,65,67)])
Platform design info loaded.
> xa = affy::ReadAffy(filenames=dir()[c(63,65,67)])
> dim(exprs(xo))
[1] 1102500       3
> dim(exprs(xa))
[1] 1102500       3
> rxo = rma(xo)
Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function "probeNames", for
signature "GeneFeatureSet"
> rxo = oligo::rma(xo)
Background correcting... OK
Normalizing... OK
Summarizing... OK
> rxa = affy::rma(xa)
Background correcting
Normalizing
Calculating Expression
> dim(exprs(rxo))
[1] 33297     3
> dim(exprs(rxa))
[1] 32321     3
> sessionInfo()
R version 2.13.0 Under development (unstable) (2011-03-01 r54628)
Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] hugene10stv1cdf_2.7.1     affyio_1.19.2
 [3] affy_1.29.1               pd.hugene.1.0.st.v1_3.0.2
 [5] RSQLite_0.9-4             DBI_0.2-5
 [7] ff_2.2-1                  bit_1.1-6
 [9] oligo_1.15.1              oligoClasses_1.13.8
[11] Biobase_2.11.9

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.13.13 Biostrings_2.19.11    IRanges_1.9.25
[4] affxparser_1.23.0     preprocessCore_1.13.3 splines_2.13.0
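
Mark's explanation in the quoted message below (each probe may be used by
only one probeset, so a probeset whose probes have all been taken ends up
NA) can be sketched in base R.  This is a toy model only: the probeset
names, probe IDs, intensities and the summarize_once() helper are all made
up for illustration, and a plain median stands in for the real RMA
summarization.  It is not the affy or just.rma code.

```r
# Toy model only: hypothetical probesets and probe IDs, not real HuGene data
# and not the actual affy/just.rma implementation.
probesets <- list(
  ps1 = c("p1", "p2", "p3"),
  ps2 = c("p3", "p4"),   # shares p3 with ps1
  ps3 = c("p1", "p2")    # every probe is already claimed by ps1
)
intensity <- c(p1 = 8.1, p2 = 7.9, p3 = 8.4, p4 = 6.2)  # made-up log2 values

# Summarize each probeset using only probes that no earlier probeset has
# claimed; a probeset with no free probes is left as NA.
summarize_once <- function(probesets, intensity) {
  used <- character(0)
  out <- setNames(rep(NA_real_, length(probesets)), names(probesets))
  for (ps in names(probesets)) {
    free <- setdiff(probesets[[ps]], used)  # probes not yet claimed
    if (length(free) > 0) {
      out[ps] <- median(intensity[free])    # stand-in for RMA summarization
      used <- c(used, free)
    }
  }
  out
}

res <- summarize_once(probesets, intensity)
print(res)                 # ps1 and ps2 get values; ps3 is NA
names(res)[is.na(res)]     # probesets that would go missing
```

In this sketch the outcome depends on the order in which probesets are
processed, which is why only some of the probesets that share probes are
lost.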


2011/3/3 Michal Blazejczyk <michal.blazejczyk at mail.mcgill.ca>:
> Dear Mark,
>
> Thank you for your answer.
>
> Please correct me if I'm getting the wrong impression, but doesn't this mean
> that just.rma() and rma() are simply wrong in this case?  And if that's the case
> then should they be used for ST data?  In previous versions of Bioconductor they
> simply did not work (there was no cdf environment) but now that they do users will
> be using them, generating results that are not complete...
>
> Best,
> Michał
>
>
>
> Mark Cowley <m.cowley at garvan.org.au> wrote:
>> Michal,
>> In just.rma and rma, it was assumed that each probe could belong to at most one
>> probeset: once a probe has been used, it cannot be reused.
>> On the ST arrays, some probes can be in many probesets, so if you use rma,
>> eventually all the probes in some probeset have already been used by the time
>> that probeset needs them, and you get NAs.
>
>> Mark
>
>> On 24/02/2011, at 8:40 AM, Michal Blazejczyk wrote:
>
>>> Dear Christian,
>>>
>>> I am aware of the existence of xps.  However, we can't use it for our purposes,
>>> largely because it is too complicated to set up (or at least, that was the case
>>> the last time we looked at it).  I would still like to know what's happening in
>>> just.rma()  :)
>>>
>>> Best,
>>> Michał
>>>
>>>
>>>
>>> cstrato <cstrato at aon.at> wrote:
>>>> Dear Michal,
>>>
>>>> As an alternative to just.rma() you could use the Bioconductor package
>>>> xps which uses the Affymetrix PGF-file as well as the Affymetrix
>>>> annotations, and thus should contain all probesets. xps also has a
>>>> vignette, "APTvsXPS.pdf", which compares the RMA results obtained
>>>> from APT and from xps for the HuGene 1.0 ST array.
>>>
>>>> Best regards
>>>> Christian
>>>> _._._._._._._._._._._._._._._._._._
>>>> C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
>>>> V.i.e.n.n.a           A.u.s.t.r.i.a
>>>> e.m.a.i.l:        cstrato at aon.at
>>>> _._._._._._._._._._._._._._._._._._
>>>
>>>
>>>> On 2/23/11 7:06 PM, Michal Blazejczyk wrote:
>>>>> Dear group,
>>>>>
>>>>> I have noticed that Bioconductor's just.rma() function returns fewer transcript-level
>>>>> probesets than RMA in APT for the Human Gene 1.0 ST array.  To be specific, 819 probesets
>>>>> are missing, and most of them seem to be "real", i.e. they are annotated when I run them
>>>>> through NetAffx.
>>>>>
>>>>> I would like to know why this is happening, and whether it is expected behaviour
>>>>> or a bug.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Michał Błażejczyk
>>>>> FlexArray Lead Developer
>>>>> McGill University and Genome Quebec Innovation Centre
>>>>> http://www.gqinnovationcenter.com/services/bioinformatics/flexarray/index.aspx?l=e
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


