[BioC] aroma.affymetrix and affy - different results

Henrik Bengtsson hb at stat.berkeley.edu
Tue Jul 8 09:50:20 CEST 2008


Here are some more details on what Ken says:

The QuantileNormalization(cs, ...) method in aroma.affymetrix
normalizes each array separately toward a target distribution.

If not specified explicitly, the target distribution is calculated as
the average of the sorted (sort()) and interpolated (approx())
intensities across all arrays in the data set.  The details can be
found in averageQuantile() for the AffymetrixCelSet class.  The
algorithm is the same as aroma.light::averageQuantile() for the 'list'
class, which was adopted from 'limma' in 2002 and 2006 (which to the
best of my knowledge originates from Ben Bolstad's first
implementation).  See print(aroma.light::averageQuantile.list) for
details.
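
To illustrate the averaging step, here is a minimal sketch in base R.
It is not the aroma.light code; averageQuantileSketch() is a
hypothetical name, and it assumes each array has at least two
non-missing intensities.

```r
# Hypothetical helper, NOT aroma.light::averageQuantile() itself.
# X: list of numeric vectors, one per array (lengths may differ).
averageQuantileSketch <- function(X) {
  nTarget <- max(sapply(X, length))
  # Common grid of quantile positions in [0, 1]
  probs <- (seq_len(nTarget) - 1) / (nTarget - 1)
  total <- numeric(nTarget)
  for (x in X) {
    xs <- sort(x)                 # sort() also drops NAs
    n <- length(xs)
    # Interpolate the sorted intensities onto the common grid
    xq <- approx(x = (seq_len(n) - 1) / (n - 1), y = xs, xout = probs)$y
    total <- total + xq
  }
  # Average the quantiles across arrays
  total / length(X)
}

# Made-up intensities for two tiny "arrays"
target <- averageQuantileSketch(list(c(5, 1, 3), c(2, 6, 4)))
print(target)
```

The interpolation step only matters when arrays have different numbers
of non-missing values; with equal lengths it reduces to averaging the
sorted vectors element by element.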

The normalization toward the target distribution is then done using
aroma.light::normalizeQuantile(), which also was adopted from 'limma'
(with the same ancestors etc).  See
print(aroma.light::normalizeQuantileRank.numeric) for details.
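
The rank-based mapping toward a target can be sketched roughly as
follows.  Again this is an illustration in base R, not the aroma.light
implementation; normalizeToTargetSketch() is a hypothetical name, and
the tie handling (ties.method = "first") is an assumption made for the
example.

```r
# Hypothetical helper, NOT aroma.light::normalizeQuantileRank() itself.
# x: intensities of one array; target: target distribution.
normalizeToTargetSketch <- function(x, target) {
  ok <- !is.na(x)
  n <- sum(ok)
  # Rank of each non-missing intensity within its own array
  r <- rank(x[ok], ties.method = "first")
  # Target quantiles evaluated at n evenly spaced positions
  xTarget <- sort(target)
  nT <- length(xTarget)
  tq <- approx(x = (seq_len(nT) - 1) / (nT - 1), y = xTarget,
               xout = (seq_len(n) - 1) / (n - 1))$y
  # Replace each intensity by the target quantile of the same rank
  x[ok] <- tq[r]
  x
}

# Made-up example: the array takes on the target's distribution
xNorm <- normalizeToTargetSketch(c(10, 30, 20), target = c(1.5, 3.5, 5.5))
print(xNorm)
```

Missing values stay missing, and the remaining intensities are mapped
by rank, so the ordering of probes within the array is preserved.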

I haven't followed the development of the limma version, but I'd
expect the two implementations to give very similar, if not
all.equal(), results.

Finally, another reason for the observed differences may be
differences in precision.  The output from QuantileNormalization is
stored in CEL files, where probe intensities are stored as floats (4
bytes), whereas normalized data held in memory are doubles (8 bytes),
as is the case for most Bioconductor packages.
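
The size of the float rounding can be checked with a small base R
sketch.  toSingle() is a hypothetical helper that round-trips doubles
through 4-byte storage using writeBin()/readBin(); the intensities are
made up, and this is not how aroma.affymetrix writes CEL files.

```r
# Hypothetical helper: round doubles to single precision (4 bytes)
toSingle <- function(x) {
  bytes <- writeBin(x, raw(), size = 4L)
  readBin(bytes, what = "numeric", n = length(x), size = 4L)
}

x <- c(123.456789, 1000.1, 65535.3)   # made-up intensities
x32 <- toSingle(x)

# Relative error introduced by storing as float;
# bounded by 2^-24 (about 6e-08) for values in the normal float range
relErr <- abs(x32 - x) / abs(x)
print(relErr)
```

Relative errors of this size are of the same order as the 2.3e-08
mean relative difference reported for the background-corrected data
below, so float storage may account for that part of the discrepancy.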

FYI, one of the redundancy tests run before every major release of
aroma.affymetrix verifies that aroma.affymetrix
(RmaBackgroundCorrection + QuantileNormalization + RmaPlm) can
reproduce the RMA chip-effect estimates as estimated by affyPLM
(mean/sd squared differences < 10^-4).  The test was adopted from the
script that Mark Robinson posted earlier.



On Mon, Jul 7, 2008 at 11:16 PM, ken.m.simpson at gmail.com
<ken.m.simpson at gmail.com> wrote:
> Hi Markus,
> Glad to hear you are getting (almost!) consistent results.  The
> remaining discrepancy is very small and probably not of concern.  It
> is no doubt due to slight differences in the implementation of
> quantile normalization in the two packages, but without delving into
> the code I can't give a definitive answer on that.
> Cheers,
> Ken
> On Jul 7, 5:43 pm, schmidi <schmidber... at gmx.at> wrote:
>> Hi Ken,
>> you are right. I removed the normalised signals in the probeData
>> directory and re-ran.
>> Now I have these results:
>> > all.equal(exprs(affyBatch_bgc),exprs(affyBatch_aroma_bgc))
>> [1] "Attributes: < Component 2: Component 2: 10 string mismatches >"
>> [2] "Mean relative difference: 2.300415e-08"
>> > all.equal(exprs(affyBatch_norm),exprs(affyBatch_aroma_norm))
>> [1] "Attributes: < Component 2: Component 2: 10 string mismatches >"
>> [2] "Mean relative difference: 0.000647665"
>> But there is still an error of the order of 10^(-4). This is quite
>> a lot for two methods that should produce the same results. But the
>> raw data only have an accuracy of 10^(-1), so I think this could be
>> OK. I also had a look at some quality plots (hist and boxplot); the
>> plots look very good.
>> Best
>> Markus