[BioC] difference between rmaPLM and default fitPLM?

Thu Aug 14 09:21:02 CEST 2008

I understand it is possible to get SEs from RMA by using expresso:

library(affy)
data(affybatch.example)
rmaExpressionSet <- rma(affybatch.example)
expressoRmaExpressionSet <- expresso(affybatch.example, bgcorrect.method="rma",
normalize.method="quantiles", pmcorrect.method="pmonly",
summary.method="medianpolish")
all.equal(exprs(rmaExpressionSet), exprs(expressoRmaExpressionSet))
mySEs <- assayDataElement(expressoRmaExpressionSet, "se.exprs")

I'll leave the question of whether the SEs created using this are any good or
not to someone else

Best wishes

Richard.

Ben Bolstad wrote:
> In answer to your questions:
> 
> 1. They are not intended to return the exact same numerical values.
> There is probably details in one of the affyPLM that explains what
> fitPLM does in general for fitting models. In particular, see appendix A
> in:
> 
> http://www.bioconductor.org/packages/2.2/bioc/vignettes/affyPLM/inst/doc/AffyExtensions.pdf
> 
> more specifically it is fitting the model
> 
> y_ij = probe_i + chip_j + e_ij
> 
> for each probeset. This is the same model that RMA fits using the median
> polish. Different fitting algorithms lead to numerically different
> parameter estimates, though I would be surprised if there was any
> significant difference in performance between the two.
> 
> All rmaPLM is meant to do is the standard RMA computation but return it
> in a PLMset (meaning the model residuals also get returned in the
> appropriate slot of the PLMset). It does not, as you have observed,
> return SE estimates.
> 
> 2. If you want the se.chip.coefs that correspond directly to the
> expression values you get from rma() then you out of luck. You could do
> what most people do, which is assume that for all intents and purposes
> the values you get from fitPLM are equivalent to RMA (though since RMA
> is a bit of a "brand name", they are not really quite the same as you
> have seen, I don't make that exact claim).
> 
> Why is there not SE for RMA? Basically, the main reason is that as far
> as I am aware there is not any asymptotic theory for the median polish
> leading to good SE. I suppose that I could have cooked up something
> ad-hoc for RMA, but the PLM approach superceded it instead.
> 
> Best,
> 
> Ben
> 
> 
> On Wed, 2008-08-13 at 16:16 -0500, Jenny Drnevich wrote:
>> Hi again,
>>
>> I've gone through the codes of fitPLM and rmaPLM, and I think the 
>> difference between them is in the modelparam; rmaPLM instead uses 
>> md.param, which has only two of the list items of modelparam (I don't 
>> pretend to understand what they are). Each function also calls a 
>> different .Call code, and the one rmaPLM uses outputs NaNs for all 
>> the se.*.coefs. So, my questions remain:
>>
>> 1. Are rmaPLM and fitPLM with defaults supposed to return the same 
>> expression values? If so, why aren't they, and if not, what is fitPLM 
>> calculating?
>>
>> 2. Is there anyway to get the se.chip.coefs for RMA expression values?
>>
>> Thanks!
>> Jenny
>>
>> At 01:18 PM 8/13/2008, Jenny Drnevich wrote:
>>> HI,
>>>
>>> I'm getting a difference in the output of rmaPLM() and the default 
>>> fitPLM(). I thought the default settings of fitPLM was supposed to 
>>> be the RMA model, at least the vignette calls it a RMA-style PLM. 
>>> Besides differences in the coefs/RMA values produced, rmaPLM is not 
>>> outputting the se.chip.coefs slot, which is the standard error 
>>> estimates for the chip coefficients that I want. See code and 
>>> sessionInfo() below. I'm stepping through the code of each right now 
>>> to figure out where the differences are, but I thought maybe someone 
>>> could help me.
>>>
>>> Thanks,
>>> Jenny
>>>
>>>> raw <- ReadAffy()
>>>> raw
>>> AffyBatch object
>>> size of arrays=1002x1002 features (8 kb)
>>> cdf=Mouse430_2 (45101 affyids)
>>> number of samples=4
>>> number of genes=45101
>>> annotation=mouse4302
>>> notes=
>>>
>>>> rma.plm1 <- fitPLM(raw)
>>>> rma.plm2 <- rmaPLM(raw)
>>>> coefs(rma.plm1)[1:5,]
>>>                  ctrl      ipi     Dmp1 ipi+Dmp1
>>> 1415670_at   11.02547 11.94682 11.07018 11.97461
>>> 1415671_at   12.33057 11.79736 12.25891 11.85637
>>> 1415672_at   11.38321 11.27387 11.36435 11.27967
>>> 1415673_at   10.03639 10.30681 10.02795 10.50375
>>> 1415674_a_at 10.45065 10.39834 10.53324 10.41134
>>>> coefs(rma.plm2)[1:5,]
>>>                   ctrl      ipi      Dmp1 ipi+Dmp1
>>> 1415670_at   10.710135 11.64534 10.751418 11.64572
>>> 1415671_at   12.414390 11.86388 12.337369 11.92163
>>> 1415672_at   11.584338 11.46070 11.549982 11.51548
>>> 1415673_at    9.902446 10.20577  9.933505 10.41233
>>> 1415674_a_at 10.228274 10.13993 10.301835 10.18136
>>>> rma.rma <- rma(raw)
>>> Background correcting
>>> Normalizing
>>> Calculating Expression
>>>> exprs(rma.rma)[1:5,]
>>>                   ctrl      ipi      Dmp1 ipi+Dmp1
>>> 1415670_at   10.710135 11.64534 10.751418 11.64572
>>> 1415671_at   12.414390 11.86388 12.337369 11.92163
>>> 1415672_at   11.584338 11.46070 11.549982 11.51548
>>> 1415673_at    9.902446 10.20577  9.933505 10.41233
>>> 1415674_a_at 10.228274 10.13993 10.301835 10.18136
>>>> se(rma.plm1)[1:5,]
>>>                    ctrl        ipi       Dmp1   ipi+Dmp1
>>> 1415670_at   0.02452685 0.02395422 0.02431190 0.02582995
>>> 1415671_at   0.02061485 0.02152285 0.02013362 0.02067196
>>> 1415672_at   0.03053776 0.02985205 0.03011108 0.03238839
>>> 1415673_at   0.04038712 0.03987544 0.04036288 0.03978057
>>> 1415674_a_at 0.02810474 0.02876715 0.02745431 0.02860643
>>>> se(rma.plm2)[1:5,]
>>>              ctrl ipi Dmp1 ipi+Dmp1
>>> 1415670_at    NaN NaN  NaN      NaN
>>> 1415671_at    NaN NaN  NaN      NaN
>>> 1415672_at    NaN NaN  NaN      NaN
>>> 1415673_at    NaN NaN  NaN      NaN
>>> 1415674_a_at  NaN NaN  NaN      NaN
>>>> sessionInfo()
>>> R version 2.7.1 (2008-06-23)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
>>> States.1252;LC_MONETARY=English_United 
>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] splines   tools     stats     graphics  grDevices utils     datasets
>>> [8] methods   base
>>>
>>> other attached packages:
>>>  [1] mouse4302cdf_2.2.0   affyQCReport_1.18.0  geneplotter_1.18.0
>>>  [4] lattice_0.17-8       RColorBrewer_1.0-2   simpleaffy_2.16.0
>>>  [7] made4_1.14.0         ade4_1.4-9           affyPLM_1.16.0
>>> [10] affycoretools_1.12.0 annaffy_1.12.1       KEGG.db_2.2.0
>>> [13] gcrma_2.12.1         matchprobes_1.12.0   biomaRt_1.14.0
>>> [16] RCurl_0.9-3          GOstats_2.6.0        Category_2.6.0
>>> [19] genefilter_1.20.0    survival_2.34-1      RBGL_1.16.0
>>> [22] annotate_1.18.0      xtable_1.5-2         GO.db_2.2.0
>>> [25] AnnotationDbi_1.2.2  RSQLite_0.6-9        DBI_0.2-4
>>> [28] graph_1.18.1         limma_2.14.5         affy_1.18.2
>>> [31] preprocessCore_1.2.0 affyio_1.8.0         Biobase_2.0.1
>>> [34] RWinEdt_1.8-0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] cluster_1.11.11    grid_2.7.1         KernSmooth_2.22-22 XML_1.94-0.1
>>>
>>>
>>>
>>> Jenny Drnevich, Ph.D.
>>>
>>> Functional Genomics Bioinformatics Specialist
>>> W.M. Keck Center for Comparative and Functional Genomics
>>> Roy J. Carver Biotechnology Center
>>> University of Illinois, Urbana-Champaign
>>>
>>> 330 ERML
>>> 1201 W. Gregory Dr.
>>> Urbana, IL 61801
>>> USA
>>>
>>> ph: 217-244-7355
>>> fax: 217-265-5066
>>> e-mail: drnevich at illinois.edu
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: 
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> Jenny Drnevich, Ph.D.
>>
>> Functional Genomics Bioinformatics Specialist
>> W.M. Keck Center for Comparative and Functional Genomics
>> Roy J. Carver Biotechnology Center
>> University of Illinois, Urbana-Champaign
>>
>> 330 ERML
>> 1201 W. Gregory Dr.
>> Urbana, IL 61801
>> USA
>>
>> ph: 217-244-7355
>> fax: 217-265-5066
>> e-mail: drnevich at illinois.edu
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 

-- 
Richard D. Pearson             richard.pearson at postgrad.manchester.ac.uk
School of Computer Science,    http://www.cs.man.ac.uk/~pearsonr
University of Manchester,      Tel: +44 161 275 6178
Oxford Road,                   Mob: +44 7971 221181
Manchester M13 9PL, UK.        Fax: +44 161 275 6204