[BioC] vsn2 and print-tips
Wolfgang Huber
huber at ebi.ac.uk
Tue Mar 4 17:52:57 CET 2008
Dear Hans-Ulrich,
thank you for your thoughtful message! The executive summary: this is
indeed an (unintended) difference between vsn and vsn2, and I will
update vsn2 before the next release. It only affects applications with
multiple strata (print-tip groups).
Background: the error and normalisation model of vsn is invariant under
an overall scaling of the data: if you multiply all intensities by a
factor of 10, you will get the same output, except for an overall shift
of log2(10) on the glog2 scale. This makes sense because microarray data
don't have units, and a value of "200" can mean very different things on,
say, an Affymetrix genechip and a custom-made array.
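
The scaling property can be checked numerically. The following is a minimal
sketch in plain Python, not vsn's actual code. The parameter values are made
up, and it assumes that after multiplying the data by 10 the fit returns a
scale factor f(b) that is 10 times smaller, with the same a:

```python
import math

def glog2(u):
    """Generalised logarithm to base 2: log2(u + sqrt(u*u + 1)) = asinh(u)/log(2)."""
    return math.asinh(u) / math.log(2)

def transform(x, fb, a):
    """glog2(fb*x + a) + c, with c chosen to cancel log2(fb) + log2(2)."""
    c = -(math.log2(fb) + math.log2(2))
    return glog2(fb * x + a) + c

# Hypothetical fitted parameters for one array (illustrative values only).
fb, a = 0.01, -0.5
intensities = [50.0, 200.0, 5000.0]

original = [transform(x, fb, a) for x in intensities]

# Multiply all intensities by 10. Assumption: the fit compensates exactly,
# i.e. the scale factor becomes fb / 10 and a stays the same.
rescaled = [transform(10 * x, fb / 10, a) for x in intensities]

# Every value moves by the same amount, log2(10), which is about 3.32.
shifts = [r - o for r, o in zip(rescaled, original)]
print(shifts)
```

The shift is the same for small and large intensities because the argument of
glog2 is unchanged; only the offset c moves.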
This explains why there is this 'arbitrary' offset c. It is computed
through an explicit formula from the b's (i.e. the scale factors), so
whether your actual data contain instances of large x does not directly
matter (it may matter indirectly, by affecting how the b's are
estimated). For x -> infinity, the function glog2(f(b)*x+a)
approaches log2(x) + log2(f(b)) + log2(2), and c is computed to cancel
out the last two terms, so that for large x, the net transformation
resembles log2(x). There is one b for each array and stratum (=print tip
group). The current implementation of vsn2 computes one single value c
by taking the mean of log2(f(b)) + log2(2) across all strata and arrays.
The old vsn computed c from the b's of the first array only, but
separately for each stratum.
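
To illustrate why the two ways of computing c behave differently, here is a
small sketch in plain Python. It is not the vsn or vsn2 implementation: the
scale factors, the value of a, and the variable names are invented for the
example, and only the two rules for c described above are taken from the text.

```python
import math

def glog2(u):
    """Generalised logarithm to base 2."""
    return math.asinh(u) / math.log(2)

def offset_term(f):
    """The part of the large-x limit that c is meant to cancel."""
    return math.log2(f) + math.log2(2)

# Hypothetical scale factors f(b), indexed as fb[array][stratum].
fb = [
    [0.010, 0.020, 0.005],  # array 1, strata 1..3
    [0.012, 0.018, 0.006],  # array 2, strata 1..3
]
a = 0.3  # hypothetical offset parameter, the same everywhere for simplicity

# Rule 1 (vsn2 as described): a single c, the mean over all strata and arrays.
terms = [offset_term(f) for row in fb for f in row]
c_global = -sum(terms) / len(terms)

# Rule 2 (old vsn as described): one c per stratum, from the first array only.
c_first_array = [-offset_term(f) for f in fb[0]]

# Compare with log2(x) at a large intensity on the first array.
x = 1e6
resid_global = [glog2(f * x + a) + c_global - math.log2(x) for f in fb[0]]
resid_per_stratum = [
    glog2(f * x + a) + c - math.log2(x) for f, c in zip(fb[0], c_first_array)
]

print(resid_global)       # differs from stratum to stratum
print(resid_per_stratum)  # close to zero in every stratum
```

With a single global c, each stratum keeps a residual offset of roughly
log2(f(b)) minus the overall mean, which is the kind of print-tip pattern
reported in the original message.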
I had not anticipated that the variation of the b's between strata could
have such an effect, but given your observations, and with more thought
about it, it does make sense. I will update vsn2 to compute c by averaging
over the arrays, but separately for each stratum.
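
As a rough sketch of the planned rule (plain Python with invented numbers, not
the code that will go into vsn2), c can be computed per stratum by averaging
log2(f(b)) + log2(2) over the arrays:

```python
import math

def glog2(u):
    """Generalised logarithm to base 2."""
    return math.asinh(u) / math.log(2)

# Hypothetical scale factors f(b), indexed as fb[array][stratum].
fb = [
    [0.010, 0.020, 0.005],  # array 1, strata 1..3
    [0.012, 0.018, 0.006],  # array 2, strata 1..3
]
a = 0.3  # hypothetical offset parameter
n_arrays, n_strata = len(fb), len(fb[0])

# One c per stratum: average over arrays of log2(f(b)) + log2(2), negated.
c_per_stratum = [
    -sum(math.log2(fb[i][s]) + math.log2(2) for i in range(n_arrays)) / n_arrays
    for s in range(n_strata)
]

# At large x, the deviation from log2(x), averaged over arrays, is then
# close to zero within every stratum.
x = 1e6
mean_resid = [
    sum(glog2(fb[i][s] * x + a) + c_per_stratum[s] - math.log2(x)
        for i in range(n_arrays)) / n_arrays
    for s in range(n_strata)
]
print(c_per_stratum)
print(mean_resid)
```

Differences between arrays within a stratum are left in place under this rule;
only the stratum-specific part of the offset is removed.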
Best wishes
Wolfgang
------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
04/03/2008 16:38 Hans-Ulrich Klein scripsit
> Dear all,
>
> I use the vsn2 method to normalize single-colour arrays with 48
> print-tips (25*26 oligos per print-tip). After normalization, the
> intensities of the 48 print-tips are in different ranges. The grid of
> the print-tips can be seen clearly on false color representations of the
> arrays' spatial distributions of feature intensities. However, scale and
> location of the intensities of a print-tip do not change across arrays.
>
> The man page of vsn2 says:
> "The data are returned on a glog scale to base 2. More precisely,
> the transformed data are subject to the transformation
> glog2(f(b)*x+a) + c, where glog2(u) = log2(u+sqrt(u*u+1)) =
> asinh(u)/log(2) is called the generalised logarithm, a and b are
> the fitted model parameters (see references), f is a parameter
> transformation [4], and the overall constant offset c is computed
> from b such that for large x the transformation approximately
> corresponds to the log2 function."
>
> Maybe there are not enough "large x" in some print-tips due to missing
> values in my data. I observed that reducing the number of oligos leads
> to even larger differences in the print-tip offsets. Are there
> parameters that influence the computation of c? Has someone else
> observed this problem? The older "vsn" function does not lead to
> different print-tip offsets.
>
> Regards,
> Hans-Ulrich
>
>
>
>
> > sessionInfo()
> R version 2.6.2 (2008-02-08)
> x86_64-pc-linux-gnu
>
> locale:
> C
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] vsn_3.2.1 limma_2.12.0 affy_1.16.0
> [4] preprocessCore_1.0.0 affyio_1.6.1 Biobase_1.16.3
>
> loaded via a namespace (and not attached):
> [1] grid_2.6.2 lattice_0.17-4 rcompgen_0.1-17
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor