[BioC] About weight function for Agilent data

Gordon K Smyth smyth at wehi.EDU.AU
Sat Nov 19 13:22:51 CET 2005


On Sat, November 19, 2005 12:51 am, Nataliya Yeremenko wrote:
> Gordon K Smyth wrote:
>
>>>Date: Thu, 17 Nov 2005 22:10:33 +0100
>>>From: Nataliya Yeremenko <eremenko at science.uva.nl>
>>>Subject: [BioC] About weight function for Agilent data
>>>To: Bioconductor List <bioconductor at stat.math.ethz.ch>
>>>
>>>I have seen here in the BioC topics several threads concerning weighting of
>>>Agilent data, however I didn't understand how important it is or
>>>how it influences linear models and differential expression testing.
>>>
>>>In particular, Agilent Feature Extraction performs quite a lot of
>>>flagging and does normalization itself.
>>>Which kinds of flags are important to set to weight zero?
>>>Should control spots be weighted zero as well?
>>>Is it wise to use the processed intensities (and skip limma's within-array
>>>normalisation) instead of the raw data?
>>>There are a number of between-array normalisations, but which one should I use?
>>>
>>>--
>>>Dr. Nataliya Yeremenko
>>>
>>>Universiteit van Amsterdam
>>>Faculty of Science
>>>IBED/AMB (Aquatische Microbiologie)
>>>Nieuwe Achtergracht 127
>>>NL-1018WS Amsterdam
>>>the Netherlands
>>>
>>>tel. + 31 20 5257089
>>>fax  + 31 20 5257064
>>>
>>
> Thanks Gordon for the explanation.
>
>>Control spots should be removed before the differential expression analysis.
>>
> As far as I understand, control spots should be removed by assigning zero
> weight in the read.maimages step
> (by creating a wt.fun).
> Is that correct?

No.  Weights are for removing individual spots.  For removing whole probes you should use
subsetting operations.  The Weaver case study in the Limma User's Guide gives an example of this.
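
To make the distinction concrete, here is a minimal sketch in plain Python. It is not limma code: the probe names, the numbers and the weighted_mean helper are all made up for illustration. A zero weight removes one spot on one array, while subsetting removes a whole probe (row) from every array at once.

```python
# Toy illustration only: plain Python, not limma. All names and values are hypothetical.

# One row per probe, one column per array (log-ratios).
log_ratios = {
    "gene_a":  [1.0, 1.2, 5.0],
    "gene_b":  [-0.5, -0.4, -0.6],
    "control": [0.0, 0.1, 0.0],
}
is_control = {"gene_a": False, "gene_b": False, "control": True}

# Spot weights have the same shape as the data: one weight per probe per array.
# The single zero removes only the third array's spot for gene_a.
weights = {
    "gene_a":  [1.0, 1.0, 0.0],
    "gene_b":  [1.0, 1.0, 1.0],
    "control": [1.0, 1.0, 1.0],
}


def weighted_mean(values, wts):
    """Average the values, ignoring any spot whose weight is zero."""
    return sum(v * w for v, w in zip(values, wts)) / sum(wts)


# Subsetting: drop whole probes (rows) before any model fitting.
kept = {probe: vals for probe, vals in log_ratios.items() if not is_control[probe]}

# Weighting: the remaining probes are still present, minus individual bad spots.
means = {probe: weighted_mean(vals, weights[probe]) for probe, vals in kept.items()}
```

After this, kept no longer contains the control probe at all, whereas gene_a is still present and is simply averaged over its two remaining spots.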

>>Apart from that, my experience is that most flags estimated by image analysis programs are best
>>ignored.  They tend to be very conservative and to encourage you to remove data which is actually
>>quite usable in the context of a replicated experiment.  However this is not based on any careful
>>analysis of AgilentFE's flags, so you may find differently.
>>
> There are 8 flags in FE:  they cover Feature and background
> non-uniformity and population outliers in each channel separately.
> Which ones are important to down-weight before normalization?

I've already said that I would ignore all the flags.

> Concerning the normalization step:
> As for within-array normalization, it seems that only "lowess" is suitable for
> Agilent arrays.
> How can I judge which between-array normalization is most reliable?
> For example, "Aquantile" produced plotDensities curves that matched each
> other much more closely than "vsn" did.
> Does it mean that "vsn" is not good in my case?

No it doesn't.  Aquantile is designed to make the densities equal while vsn is not, so this is not
a criterion of appropriateness.
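
As a rough illustration of why the densities line up by construction, here is a minimal sketch of ordinary quantile normalization in plain Python. It is not limma's implementation: the quantile_normalize function and the intensity values are hypothetical, and ties and missing values are ignored. Aquantile applies this kind of normalization to the A-values across arrays.

```python
def quantile_normalize(columns):
    """Give every column (array) the same empirical distribution.

    Simplified sketch: assumes equal-length columns with no missing values.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across columns.
    reference = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        # Indices of this column's values, from smallest to largest.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # keep each spot's rank, replace its value
        normalized.append(out)
    return normalized


# Hypothetical average log-intensities (A-values) for three arrays.
a_values = [
    [5.0, 7.0, 9.0, 11.0],
    [6.5, 6.0, 12.0, 8.0],
    [10.0, 4.0, 7.5, 6.0],
]
normalized = quantile_normalize(a_values)
```

Every normalized column contains exactly the same set of values, so the density plots coincide whatever the data looked like beforehand.  A method that does not force this, such as vsn, will always look worse on that particular plot.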

It is not yet established which is the best between-array normalization method, and perhaps it
never will be.  I use Aquantile.  I expect that Aquantile, quantile and vsn are all good enough.

Note that vsn is an omnibus method.  It should not be combined with any other normalization or
background correction methods, including loess.

>>If you are using limma's lmscFit function, you do not have the option of weighting or flagging
>>spots anyway.
>>
> But I'm still using weighting for the normalization step.

Well, that's up to you.  I've already said that I would not do this.

> I split the two-colour arrays into two single-channel ones
> only because, apart from biological replicates, I also have
> some technical ones, where biological replicates are mixed:
> array1:  A1 vs O1
> array2:  O2 vs A2
> array3:  A2 vs O3
> etc...
> (almost every one, but not all of them, is technically replicated by
> a dye-swap as well)
> I understand now that this kind of design is far from ideal, but
> the experiments are done.
> Is there still a way to create a design for this type of experiment and
> a contrast A vs O?
> Is it true that the only way to create such a contrast using all
> replicates is to perform single-channel fitting?

I cannot understand this at all.  You have already posted a complete analysis of this experiment
to this list which seemed perfectly satisfactory:

https://www.stat.math.ethz.ch/pipermail/bioconductor/2005-November/010873.html

You seem to have abandoned the earlier much better analysis for something which doesn't make any
sense to me, for no reason that I can make out.  Why should you do this?

Gordon

> Regards
> Nataliya
>
>>Best wishes
>>Gordon
>>
>>
>>
