[BioC] Normalized data in expresso and Expression Console differ

Oliver Stolpe oliver.stolpe at fu-berlin.de
Tue Jun 23 11:25:15 CEST 2009


Hello list,

I currently use the expresso function from the Bioconductor affy package to
analyze Affymetrix data:

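library(affy)  # provides expresso(); exprs() comes from Biobase

# 'data' is the AffyBatch holding the CEL files
normalized <- expresso(data, bgcorrect.method = "mas",
            normalize.method = "quantiles",
            pmcorrect.method = "mas",
            summary.method = "mas")
matrix_expresso <- exprs(normalized)
matrix_expresso_log2 <- log2(matrix_expresso)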

As a reference I use the Expression Console by Affymetrix. My goal is
to reproduce the normalized data (and therefore the resulting boxplot)
from the Expression Console in R. I took the log2 after normalization
and correction because the Expression Console delivered relatively small
values (they seemed to be log-transformed), whereas the expresso data
had a very large range. Unfortunately, the results differ.
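
As a sanity check on the log2 step, the following minimal sketch rebuilds
the six rows of expresso output quoted under "Some helpful data" and takes
log2 of them, which should match the log2 table in the same section. Only
the values printed in this post are used, and the column names are
shortened here purely for illustration.

```r
# First six rows of the expresso output as printed in this post
# (row names were not shown, so none are set here).
cols <- paste0("data", 1:6)
matrix_expresso <- matrix(
  c(  67.16587,   72.66765,   73.49201,   74.00240,   73.5097,   67.97570,
      72.03782,   95.80303,   97.60087,   64.60356,   93.9136,   84.26307,
     117.65746,  142.88926,  138.01063,  159.64211,  145.7278,  124.94947,
     185.33413,  292.81031,  232.82629,  259.88629,  250.9573,  235.76545,
     164.88572,  260.95710,  243.47892,  247.80303,  235.0867,  251.55364,
    1238.80516, 1674.33256, 1525.44652, 1490.71100, 1486.8813, 1523.14721),
  nrow = 6, byrow = TRUE, dimnames = list(NULL, cols))

matrix_expresso_log2 <- log2(matrix_expresso)

# The raw values span a wide range; the log2 values span a narrow one.
print(range(matrix_expresso))
print(range(matrix_expresso_log2))
print(round(matrix_expresso_log2, 6))
```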


Does anyone know why they differ so noticeably (different means,
many outliers)? You may have a look at the boxplots I attached.
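
To put numbers on "different means, many outliers", one option is to
tabulate per-array statistics instead of comparing the plots by eye. The
sketch below does this for the six log2 rows quoted in this post;
array_summary is a hypothetical helper, not part of any package, and the
same function could be applied to a matrix exported from the Expression
Console (assuming such an export is available) for a side-by-side
comparison.

```r
# Log2 values as printed in this post (first six rows only).
m <- matrix(
  c( 6.069657,  6.183241,  6.199515,  6.209500,  6.199863,  6.086947,
     6.170683,  6.581999,  6.608822,  6.013542,  6.553262,  6.396829,
     6.878449,  7.158754,  7.108636,  7.318697,  7.187132,  6.965201,
     7.533985,  8.193823,  7.863110,  8.021737,  7.971298,  7.881209,
     7.365323,  8.027669,  7.927653,  7.953050,  7.877049,  7.974722,
    10.274734, 10.709370, 10.575016, 10.541785, 10.538074, 10.572840),
  nrow = 6, byrow = TRUE, dimnames = list(NULL, paste0("data", 1:6)))

# Per-array location and spread, plus the number of points a default
# boxplot would draw as outliers (beyond 1.5 * IQR from the hinges).
array_summary <- function(x) {
  data.frame(
    mean       = colMeans(x),
    median     = apply(x, 2, stats::median),
    iqr        = apply(x, 2, stats::IQR),
    n_outliers = apply(x, 2, function(col)
      length(grDevices::boxplot.stats(col)$out))
  )
}

s <- array_summary(m)
print(s)
```

With only six rows the outlier counts are illustrative; the full matrices
are needed for a meaningful comparison.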


Even when I leave out the normalization step in expresso, the result
looks nearly the same.
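
One way the normalization step might be switched off is through the
normalize argument of expresso, as in the sketch below. This is an
assumption about how the comparison could be run, not necessarily how it
was run for the attached plots; run_expresso and abatch are hypothetical
names, and the usage lines are commented out because they need the affy
package and the CEL files.

```r
# Hypothetical wrapper around affy::expresso(); 'abatch' is assumed to be
# an AffyBatch, e.g. as returned by affy::ReadAffy().
run_expresso <- function(abatch, normalize = TRUE) {
  affy::expresso(abatch,
                 bgcorrect.method = "mas",
                 normalize        = normalize,     # FALSE skips normalization
                 normalize.method = "quantiles",   # only used if normalize = TRUE
                 pmcorrect.method = "mas",
                 summary.method   = "mas")
}

# Usage (requires affy, Biobase and CEL files in the working directory):
# abatch  <- affy::ReadAffy()
# with_qn <- log2(Biobase::exprs(run_expresso(abatch, normalize = TRUE)))
# no_norm <- log2(Biobase::exprs(run_expresso(abatch, normalize = FALSE)))
# boxplot(with_qn, main = "quantile normalized")
# boxplot(no_norm, main = "not normalized")
```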

I would be glad of any suggestions.

Thanks in advance,
best regards,
Oliver

Some helpful data:

  > head(matrix_expresso)
   data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz data5.cel.gz data6.cel.gz
       67.16587     72.66765     73.49201     74.00240      73.5097     67.97570
       72.03782     95.80303     97.60087     64.60356      93.9136     84.26307
      117.65746    142.88926    138.01063    159.64211     145.7278    124.94947
      185.33413    292.81031    232.82629    259.88629     250.9573    235.76545
      164.88572    260.95710    243.47892    247.80303     235.0867    251.55364
     1238.80516   1674.33256   1525.44652   1490.71100    1486.8813   1523.14721

  > head(matrix_expresso_log2)
   data1.cel.gz data2.cel.gz data3.cel.gz data4.cel.gz data5.cel.gz data6.cel.gz
       6.069657     6.183241     6.199515     6.209500     6.199863     6.086947
       6.170683     6.581999     6.608822     6.013542     6.553262     6.396829
       6.878449     7.158754     7.108636     7.318697     7.187132     6.965201
       7.533985     8.193823     7.863110     8.021737     7.971298     7.881209
       7.365323     8.027669     7.927653     7.953050     7.877049     7.974722
      10.274734    10.709370    10.575016    10.541785    10.538074    10.572840

  > sessionInfo()
R version 2.9.0 (2009-04-17)
i686-redhat-linux-gnu

locale:
LC_CTYPE=de_DE@euro;LC_NUMERIC=C;LC_TIME=de_DE@euro;LC_COLLATE=de_DE@euro;LC_MONETARY=C;LC_MESSAGES=de_DE@euro;LC_PAPER=de_DE@euro;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=de_DE@euro;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] zebrafishcdf_2.4.0 marray_1.22.0      limma_2.18.0       RdbiPgSQL_1.18.1
 [5] Rdbi_1.18.0        multtest_2.0.0     class_7.2-47       MASS_7.2-47
 [9] affy_1.22.0        Biobase_2.4.1

loaded via a namespace (and not attached):
[1] affyio_1.12.0        preprocessCore_1.6.0 splines_2.9.0
[4] survival_2.35-4      tools_2.9.0


-------------- next part --------------
A non-text attachment was scrubbed...
Name: boxplot_expresso_log2.png
Type: image/png
Size: 5615 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20090623/55d897e4/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: boxplot_expression_console_anonym.png
Type: image/png
Size: 15068 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20090623/55d897e4/attachment-0001.png>

