[BioC] difference between rmaPLM and default fitPLM?

Thu Aug 14 22:45:33 CEST 2008

Interesting... I'll look into expresso. Thanks Richard.

Also, thanks Ben for your explanations.

Jenny

At 02:21 AM 8/14/2008, Richard Pearson wrote:
>I understand it is possible to get SEs from RMA by using expresso:
>
>library(affy)
>data(affybatch.example)
>rmaExpressionSet <- rma(affybatch.example)
>expressoRmaExpressionSet <- expresso(affybatch.example, 
>bgcorrect.method="rma",
>normalize.method="quantiles", pmcorrect.method="pmonly",
>summary.method="medianpolish")
>all.equal(exprs(rmaExpressionSet), exprs(expressoRmaExpressionSet))
>mySEs <- assayDataElement(expressoRmaExpressionSet, "se.exprs")
>
>I'll leave the question of whether the SEs created using this are any good or
>not to someone else
>
>Best wishes
>
>Richard.
>
>
>Ben Bolstad wrote:
>>In answer to your questions:
>>1. They are not intended to return the exact same numerical values.
>>There is probably details in one of the affyPLM that explains what
>>fitPLM does in general for fitting models. In particular, see appendix A
>>in:
>>http://www.bioconductor.org/packages/2.2/bioc/vignettes/affyPLM/inst/doc/AffyExtensions.pdf
>>more specifically it is fitting the model
>>y_ij = probe_i + chip_j + e_ij
>>for each probeset. This is the same model that RMA fits using the median
>>polish. Different fitting algorithms lead to numerically different
>>parameter estimates, though I would be surprised if there was any
>>significant difference in performance between the two.
>>All rmaPLM is meant to do is the standard RMA computation but return it
>>in a PLMset (meaning the model residuals also get returned in the
>>appropriate slot of the PLMset). It does not, as you have observed,
>>return SE estimates.
>>2. If you want the se.chip.coefs that correspond directly to the
>>expression values you get from rma() then you out of luck. You could do
>>what most people do, which is assume that for all intents and purposes
>>the values you get from fitPLM are equivalent to RMA (though since RMA
>>is a bit of a "brand name", they are not really quite the same as you
>>have seen, I don't make that exact claim).
>>Why is there not SE for RMA? Basically, the main reason is that as far
>>as I am aware there is not any asymptotic theory for the median polish
>>leading to good SE. I suppose that I could have cooked up something
>>ad-hoc for RMA, but the PLM approach superceded it instead.
>>Best,
>>Ben
>>
>>On Wed, 2008-08-13 at 16:16 -0500, Jenny Drnevich wrote:
>>>Hi again,
>>>
>>>I've gone through the codes of fitPLM and rmaPLM, and I think the 
>>>difference between them is in the modelparam; rmaPLM instead uses 
>>>md.param, which has only two of the list items of modelparam (I 
>>>don't pretend to understand what they are). Each function also 
>>>calls a different .Call code, and the one rmaPLM uses outputs NaNs 
>>>for all the se.*.coefs. So, my questions remain:
>>>
>>>1. Are rmaPLM and fitPLM with defaults supposed to return the same 
>>>expression values? If so, why aren't they, and if not, what is 
>>>fitPLM calculating?
>>>
>>>2. Is there anyway to get the se.chip.coefs for RMA expression values?
>>>
>>>Thanks!
>>>Jenny
>>>
>>>At 01:18 PM 8/13/2008, Jenny Drnevich wrote:
>>>>HI,
>>>>
>>>>I'm getting a difference in the output of rmaPLM() and the 
>>>>default fitPLM(). I thought the default settings of fitPLM was 
>>>>supposed to be the RMA model, at least the vignette calls it a 
>>>>RMA-style PLM. Besides differences in the coefs/RMA values 
>>>>produced, rmaPLM is not outputting the se.chip.coefs slot, which 
>>>>is the standard error estimates for the chip coefficients that I 
>>>>want. See code and sessionInfo() below. I'm stepping through the 
>>>>code of each right now to figure out where the differences are, 
>>>>but I thought maybe someone could help me.
>>>>
>>>>Thanks,
>>>>Jenny
>>>>
>>>>>raw <- ReadAffy()
>>>>>raw
>>>>AffyBatch object
>>>>size of arrays=1002x1002 features (8 kb)
>>>>cdf=Mouse430_2 (45101 affyids)
>>>>number of samples=4
>>>>number of genes=45101
>>>>annotation=mouse4302
>>>>notes=
>>>>
>>>>>rma.plm1 <- fitPLM(raw)
>>>>>rma.plm2 <- rmaPLM(raw)
>>>>>coefs(rma.plm1)[1:5,]
>>>>                  ctrl      ipi     Dmp1 ipi+Dmp1
>>>>1415670_at   11.02547 11.94682 11.07018 11.97461
>>>>1415671_at   12.33057 11.79736 12.25891 11.85637
>>>>1415672_at   11.38321 11.27387 11.36435 11.27967
>>>>1415673_at   10.03639 10.30681 10.02795 10.50375
>>>>1415674_a_at 10.45065 10.39834 10.53324 10.41134
>>>>>coefs(rma.plm2)[1:5,]
>>>>                   ctrl      ipi      Dmp1 ipi+Dmp1
>>>>1415670_at   10.710135 11.64534 10.751418 11.64572
>>>>1415671_at   12.414390 11.86388 12.337369 11.92163
>>>>1415672_at   11.584338 11.46070 11.549982 11.51548
>>>>1415673_at    9.902446 10.20577  9.933505 10.41233
>>>>1415674_a_at 10.228274 10.13993 10.301835 10.18136
>>>>>rma.rma <- rma(raw)
>>>>Background correcting
>>>>Normalizing
>>>>Calculating Expression
>>>>>exprs(rma.rma)[1:5,]
>>>>                   ctrl      ipi      Dmp1 ipi+Dmp1
>>>>1415670_at   10.710135 11.64534 10.751418 11.64572
>>>>1415671_at   12.414390 11.86388 12.337369 11.92163
>>>>1415672_at   11.584338 11.46070 11.549982 11.51548
>>>>1415673_at    9.902446 10.20577  9.933505 10.41233
>>>>1415674_a_at 10.228274 10.13993 10.301835 10.18136
>>>>>se(rma.plm1)[1:5,]
>>>>                    ctrl        ipi       Dmp1   ipi+Dmp1
>>>>1415670_at   0.02452685 0.02395422 0.02431190 0.02582995
>>>>1415671_at   0.02061485 0.02152285 0.02013362 0.02067196
>>>>1415672_at   0.03053776 0.02985205 0.03011108 0.03238839
>>>>1415673_at   0.04038712 0.03987544 0.04036288 0.03978057
>>>>1415674_a_at 0.02810474 0.02876715 0.02745431 0.02860643
>>>>>se(rma.plm2)[1:5,]
>>>>              ctrl ipi Dmp1 ipi+Dmp1
>>>>1415670_at    NaN NaN  NaN      NaN
>>>>1415671_at    NaN NaN  NaN      NaN
>>>>1415672_at    NaN NaN  NaN      NaN
>>>>1415673_at    NaN NaN  NaN      NaN
>>>>1415674_a_at  NaN NaN  NaN      NaN
>>>>>sessionInfo()
>>>>R version 2.7.1 (2008-06-23)
>>>>i386-pc-mingw32
>>>>
>>>>locale:
>>>>LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
>>>>States.1252;LC_MONETARY=English_United 
>>>>States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>>
>>>>attached base packages:
>>>>[1] splines   tools     stats     graphics  grDevices utils     datasets
>>>>[8] methods   base
>>>>
>>>>other attached packages:
>>>>  [1] mouse4302cdf_2.2.0   affyQCReport_1.18.0  geneplotter_1.18.0
>>>>  [4] lattice_0.17-8       RColorBrewer_1.0-2   simpleaffy_2.16.0
>>>>  [7] made4_1.14.0         ade4_1.4-9           affyPLM_1.16.0
>>>>[10] affycoretools_1.12.0 annaffy_1.12.1       KEGG.db_2.2.0
>>>>[13] gcrma_2.12.1         matchprobes_1.12.0   biomaRt_1.14.0
>>>>[16] RCurl_0.9-3          GOstats_2.6.0        Category_2.6.0
>>>>[19] genefilter_1.20.0    survival_2.34-1      RBGL_1.16.0
>>>>[22] annotate_1.18.0      xtable_1.5-2         GO.db_2.2.0
>>>>[25] AnnotationDbi_1.2.2  RSQLite_0.6-9        DBI_0.2-4
>>>>[28] graph_1.18.1         limma_2.14.5         affy_1.18.2
>>>>[31] preprocessCore_1.2.0 affyio_1.8.0         Biobase_2.0.1
>>>>[34] RWinEdt_1.8-0
>>>>
>>>>loaded via a namespace (and not attached):
>>>>[1] cluster_1.11.11    grid_2.7.1         KernSmooth_2.22-22 XML_1.94-0.1
>>>>
>>>>
>>>>
>>>>Jenny Drnevich, Ph.D.
>>>>
>>>>Functional Genomics Bioinformatics Specialist
>>>>W.M. Keck Center for Comparative and Functional Genomics
>>>>Roy J. Carver Biotechnology Center
>>>>University of Illinois, Urbana-Champaign
>>>>
>>>>330 ERML
>>>>1201 W. Gregory Dr.
>>>>Urbana, IL 61801
>>>>USA
>>>>
>>>>ph: 217-244-7355
>>>>fax: 217-265-5066
>>>>e-mail: drnevich at illinois.edu
>>>>
>>>>_______________________________________________
>>>>Bioconductor mailing list
>>>>Bioconductor at stat.math.ethz.ch
>>>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>Search the archives: 
>>>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>Jenny Drnevich, Ph.D.
>>>
>>>Functional Genomics Bioinformatics Specialist
>>>W.M. Keck Center for Comparative and Functional Genomics
>>>Roy J. Carver Biotechnology Center
>>>University of Illinois, Urbana-Champaign
>>>
>>>330 ERML
>>>1201 W. Gregory Dr.
>>>Urbana, IL 61801
>>>USA
>>>
>>>ph: 217-244-7355
>>>fax: 217-265-5066
>>>e-mail: drnevich at illinois.edu
>>>
>>>_______________________________________________
>>>Bioconductor mailing list
>>>Bioconductor at stat.math.ethz.ch
>>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>Search the archives: 
>>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>>_______________________________________________
>>Bioconductor mailing list
>>Bioconductor at stat.math.ethz.ch
>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>Search the archives: 
>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>--
>Richard D. Pearson             richard.pearson at postgrad.manchester.ac.uk
>School of Computer Science,    http://www.cs.man.ac.uk/~pearsonr
>University of Manchester,      Tel: +44 161 275 6178
>Oxford Road,                   Mob: +44 7971 221181
>Manchester M13 9PL, UK.        Fax: +44 161 275 6204