[BioC] Adding chips to an existing set of normalised data

Rafael A. Irizarry ririzarr at jhsph.edu
Wed Jun 4 11:53:27 MEST 2003


if your data is decent, what you describe won't be that big an issue, 
but here are various strategies to solve the problem you describe:

0- keep your cel files and redo everything every time (con: not efficient 
at all)
1- do rma at the probe level. then, before any expression level analysis, 
normalize the merged exprsets. (con: you may over-normalize)
2- decide on a "typical probe level distribution" and always map to that 
(con: requires choice of a distribution and some extra coding)
3- use a non-multi-array rma (ra?). you bg correct, use a non-multichip 
normalization such as rescaling (can vsn be made mono-chip?), and 
use a robust summary, e.g. median, tukey.biweight, etc. 
(con: under my definition of a good expression measure, it won't be as good 
as rma but it'll be better than mas 5.0) 
to see how well this does you can put it through 
affycomp.biostat.jhsph.edu
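
as a rough illustration of strategy 3, here is a minimal python sketch of a 
one-chip-at-a-time measure. it is not rma and not any bioconductor function: 
it assumes the intensities are already background corrected and positive, 
rescales the chip so its median hits a fixed target, takes log2, and 
summarizes each probe set by its median. the function name, the target 
value of 500 and the input layout are all assumptions made for the example.

```python
import numpy as np

# Hypothetical fixed target for the chip median; any constant works as long
# as the same value is used for every chip added to the database.
TARGET_MEDIAN = 500.0


def summarize_single_chip(pm, probeset_ids, target=TARGET_MEDIAN):
    """Single-chip expression measure: rescale, log2, median per probe set.

    pm           : background-corrected, positive probe intensities (1-D)
    probeset_ids : probe set label for each probe, same length as pm
    Returns a dict mapping probe set label -> log2 expression value.
    """
    pm = np.asarray(pm, dtype=float)
    ids = np.asarray(probeset_ids)
    if pm.shape != ids.shape:
        raise ValueError("pm and probeset_ids must have the same length")

    # Mono-chip normalization: uses only this chip's own median.
    scaled = pm * (target / np.median(pm))
    log_pm = np.log2(scaled)

    # Robust summary: median of the log2 intensities within each probe set.
    return {
        str(ps): float(np.median(log_pm[ids == ps]))
        for ps in dict.fromkeys(ids.tolist())
    }
```

because nothing here depends on other chips, a chip processed today gives 
the same values no matter how many arrays are added later.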

i would rank these strategies: 2,1,3,0. to pick a 
typical probe level distribution in strategy 2 i 
would use as many arrays as possible. i would not use a parametric 
distribution, such as normal, just for computational convenience.
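
a minimal python sketch of strategy 2, assuming all chips are the same type 
and have the same number of probes: build an empirical reference once from 
as many arrays as are available (here the mean of the sorted columns, as in 
quantile normalization), freeze it, and map each new chip onto it by rank. 
the function names and the probes-by-arrays layout are assumptions for the 
example, not part of any package.

```python
import numpy as np


def build_reference(intensities):
    """Empirical reference distribution from a probes x arrays matrix.

    Each column is sorted, then the sorted columns are averaged row-wise,
    giving one reference value per rank. No parametric form is assumed.
    """
    intensities = np.asarray(intensities, dtype=float)
    return np.sort(intensities, axis=0).mean(axis=1)


def map_to_reference(chip, reference):
    """Replace each probe intensity by the reference value of the same rank.

    Ties are broken by probe order (stable sort), which is a simplification.
    """
    chip = np.asarray(chip, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if chip.shape != reference.shape:
        raise ValueError("chip and reference must have the same length")
    # argsort of argsort yields the 0-based rank of every probe.
    ranks = np.argsort(np.argsort(chip, kind="stable"), kind="stable")
    return reference[ranks]
```

the reference is computed once and stored, so a new chip is normalized 
without touching the arrays already in the database.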


On Wed, 4 Jun 2003, Crispin Miller wrote:

> Hi!
> Over the last few days we've been learning lots about alternate ways of dealing with low-intensity probesets and some pretty strong arguments in favour of using alternate techniques to deal with these. Firstly, thanks - the discussion has been really helpful and much appreciated! 
> 
> These have now sparked a different question for us:
> We have an ever-increasing database of affymetrix chips... Currently these have been processed and normalised using MAS5.0. As we add arrays to the set, we can compare between them since the normalisation simply sets them to have the same average intensity. 
> 
> So the question is, if I am to normalise my data with RMA, say, I get a set of normalised arrays based on statistics generated over the set of chips I normalise - i.e. each array is normalised in the context of its peers, unlike MAS5.0 (as I understand it). This is, I think, due to the a(j) parameter in the RMA model, or phi(j) for dChip, which represent the probe affinity effects and can be estimated if we have 'enough arrays' (from Irizarry et al. 2003, NA Res paper).
> 
> Now, when we add experiments to the database, are the normalised expression levels calculated for one experimental chip-set comparable to the expression levels computed for another? If not, do I need to apply RMA over the entire database each time I add a new experiment to it? And is this possible in a reasonable amount of time and memory? If not, do people have alternate suggestions? We are particularly interested in clustering and generation of expression profiles...
> 
> Crispin
> http://bioinf.picr.man.ac.uk/mbcf/microarray_ma.shtml