[BioC] problem with expresso()

Wolfgang Huber w.huber@dkfz-heidelberg.de
Tue, 14 Jan 2003 13:29:30 +0100


Hi,

Oliver and I discussed this offline last Friday. The reason for the
confusion seems to be that the summary method "medianpolish" takes the
logarithm of the data, while, for example, "avdiff" does not. However, the
normalization and data transformation method "vsn" also implies a data
transformation that is like the logarithm. Thus, a call like

normalize.AffyBatch.methods <- c(normalize.AffyBatch.methods, "vsn")
es = expresso(data,
         pmcorrect.method = "pmonly",
         bgcorrect.method = "none",
         normalize.method = "vsn",
         summary.method   = "medianpolish")

will effectively take the logarithm of the intensities TWICE. The same call
with summary.method   = "avdiff" would, however, produce the right result.
Not sure how to best resolve this? I could "re-exponentiate" the data
returned by "vsn" in normalize.AffyBatch.vsn, such that the subsequent
log-transformation done in the summary.method would produce consistent
results.
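
To make the double transformation concrete, here is a minimal base-R sketch. It does not call affy or vsn: asinh() is used as a stand-in for the vsn transformation, the intensities are made up, and re-exponentiating with base 2 is an assumption chosen only so that it matches a log2 taken in the summary step.

```r
# Toy intensities spanning the range of a 16-bit scanner (made-up values).
x <- c(50, 200, 1000, 5000, 20000, 60000)

# Stand-in for the vsn transformation: arsinh behaves like log(2x) for large x.
h <- asinh(x)

# What a log-taking summary step then produces: the log of a log-like value.
twice <- log2(h)

# Workaround sketched above: re-exponentiate before the summary step,
# so that its log2 recovers h instead of transforming it a second time.
fixed <- log2(2^h)

print(round(cbind(x, h, twice, fixed), 3))
```

In this toy setting the "twice" column is compressed far more than "h", while "fixed" agrees with "h" up to rounding error.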

However, here is a question regarding the general architecture of the affy
package: where is the right place to take the log-transformation? In the
"normalization"? In the "summary.method"? As an extra module? (Since some
people, including myself, may argue that log-transformation is not the only
thing one can do with microarray data?)
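
One way to picture the "extra module" option is a pipeline in which the transformation is its own exchangeable step and the summary step never takes a log itself. The sketch below is purely illustrative: all function names are hypothetical, nothing here is part of the affy package, and the data are simulated.

```r
# Hypothetical pipeline: each step is a plain function on a
# probes-by-arrays matrix of intensities.
normalize_scale  <- function(m) sweep(m, 2, apply(m, 2, median), "/") * median(m)
transform_log2   <- function(m) log2(m)
transform_asinh  <- function(m) asinh(m)
summarize_median <- function(m) apply(m, 2, median)  # no hidden log inside

# The transformation is passed in explicitly, so it is applied exactly once.
run_pipeline <- function(m, normalize, transform, summarize) {
  summarize(transform(normalize(m)))
}

set.seed(1)
# Made-up intensities: 10 probes x 4 arrays.
m <- matrix(rexp(40, rate = 1 / 1000), nrow = 10)

out_log   <- run_pipeline(m, normalize_scale, transform_log2,  summarize_median)
out_asinh <- run_pipeline(m, normalize_scale, transform_asinh, summarize_median)
print(out_log)
print(out_asinh)
```

With such a layout, swapping the log for another transformation changes one argument and cannot collide with a log taken elsewhere.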

Opinions?

Best regards
Wolfgang

Division of Molecular Genome Analysis (Poustka Lab)
German Cancer Research Center (DKFZ)
Im Neuenheimer Feld 580
69120 Heidelberg, Germany

w.huber@dkfz.de
http://www.dkfz.de/abt0840/whuber
Tel +49-6221-424709
Fax +49-6221-42524709


-----Original Message-----
From: bioconductor-admin@stat.math.ethz.ch
[mailto:bioconductor-admin@stat.math.ethz.ch]On Behalf Of Oliver
Hartmann
Sent: Thursday, January 09, 2003 2:47 PM
To: bioconductor
Subject: [BioC] problem with expresso()


Dear list members,

I am trying to find a way of normalizing affy chips with vsn (I found a
data set where rma() doesn't do well together with the t-statistic and I
was hoping that vsn() could fix that). I used the following script:

data <- ReadAffy()
With this, identifying differentially expressed genes works fine
(results are very similar to rma() - see my tech report for details if
you like).
But there seems to be one problem: the intensities and the values \delta
h for differential expression (equivalent to the difference between the
log-ratios if using rma()) are both on the wrong scale. Since rma() and
other methods use log-transformed data, but vsn() uses a different
transformation, I think using expresso() to calculate vsn-normalized
measures log- AND arsinh-transforms the data. Is there a way around
that? From the documentation I could find neither a way to avoid the
log-transformation nor where exactly it takes place.

If you are interested in the comparison of the performance of rma(),
vsn() and MAS() tested on Affymetrix data with spike-in genes, you can
find a tech report at http://staff-www.uni-marburg.de/~hartmann/ - but
it is only very preliminary work, sorry.

Thanks a lot

	-oliver hartmann-

--
Oliver Hartmann, Institute of Medical Biometry and Epidemiology
Philipps-University Marburg, Bunsenstr. 3, D-35037 Marburg
phone +49(0)6421 28 66514, fax +49(0)6421 28 68921
