[BioC] ...another question about using weights on microarray analysis

Thu Feb 19 11:45:20 CET 2009

Dear Matt,

Thank you for your help.
My experimental design is a reference design with three classes and five 
arrays per class (a total of 15 arrays).
We do not have technical replicates; each array is a biological 
replicate compared with a pool of the five wild-type samples (we use 
dual-color microarrays). One class consists of 5 replicates of the HeLa G1 
cell line transformed with a wild-type mutation of BRCA1. The second 
consists of 5 replicates of the HeLa G1 cell line transformed with a 
pathological mutation of BRCA1, and the last consists of 5 replicates of 
the HeLa G1 cell line transformed with another mutation of BRCA1. We 
are interested in comparing the two groups of transformed cells carrying 
the two types of mutation against the wild-type mutation.
We chose this design and this number of replicates because the variance 
observed in a cell line is lower than that observed in animals (e.g. mice).
In your opinion, are 5 arrays per class enough to use arrayWeights()?

What is your opinion about weighting the arrays spot by spot?
I am not sure I should treat saturated spots as good, and therefore useful 
for fitting the linear model. In any measurement process such values are 
discarded because they are outside the reliable range of the instrument 
used to perform the measurement.

Another observation...
If GenePix flags a spot as NotFound because the level of hybridization is 
not high enough to be considered reliable, why should I use this signal to 
fit my model?

And what about weighting spots before or after they have been normalized? 
The results change completely!

I am not really convinced about using unreliable spots, and I am quite 
confused about the workflow to follow.

Could you, or anyone else, help me?

Thank you so much

Erika

----- Original Message ----- 
From: "Matt Ritchie" <mer36 at cam.ac.uk>
To: "Jenny Drnevich" <drnevich at illinois.edu>; "Erika Melissari" 
<erika.melissari at bioclinica.unipi.it>
Cc: <bioconductor at stat.math.ethz.ch>
Sent: Thursday, February 19, 2009 05:12 AM
Subject: Re: [BioC] ...another question about using weights on microarray 
analysis

> Dear Jenny and Erika,
>
> Regarding the question on array weights, so long as you have enough arrays 
> to fit the linear model to the means, you will be able to estimate array 
> variance factors using arrayWeights(). The example given on the 
> arrayWeights help page uses the methodology on a set of 6 arrays with 3 
> replicates per group.
>
> And yes, spot and array weights can be combined in the analysis by 
> multiplying them together. Make sure that what you are multiplying are 
> matrices of the same dimension though - the output of arrayWeights is a 
> vector, so you will want to run asMatrixWeights() on this vector before 
> multiplying with the spot weights.
>
> Best wishes,
>
> Matt
>
>>Hi Erika,
>>
>>Filtering spots on each array individually has been addressed several 
>>times on the list, and the general consensus is to only do it in very rare 
>>circumstances, such as when you have manually flagged spots that are 
>>scratches, dust spots, e.g., where the reported value has ABSOLUTELY NO 
>>RELATIONSHIP to whatever the real value might have been. Spots with low 
>>SNR, auto-flagged by GenePix as "missing", or saturated spots all have 
>>values that are approximations of what the real value is, even if they 
>>aren't as precise because they are outside the measurement abilities of 
>>the scan. As I tell my students - zeros are REAL data points - would you 
>>throw them out in other scientific measurements? NO. It is fine to throw 
>>out a spot that fails to meet your criteria on ALL arrays, like the 
>>control spots.
>>
>>I'm not sure about the array quality weights... the example uses 10 
>>replicates per group, which is probably a fine number to use to determine 
>>which arrays aren't as much alike as the others, but I'm not sure if it's 
>>good to use when you only have 3 replicates. Anyone care to comment on 
>>this?
>>
>>Cheers,
>>Jenny
>>
>>At 05:49 AM 2/17/2009, Erika Melissari wrote:
>>>Hello all,
>>>
>>>I have found discordant opinions among Bioconductor emails regarding the 
>>>use of quality weights in microarray analysis and I would like to 
>>>understand clearly what to do before starting the statistical 
>>>analysis of my last experiment.
>>>I use LIMMA to perform statistical analysis of microarray experiments.
>>>Usually, I assign a weight to all the spots of my experiment by using in 
>>>read.maimages() this wt.fun:
>>>
>>>function(x, threshold=3){
>>>
>>>#to exclude spots with SNR<3 on both channels
>>>snrok <- !(x[,"SNR 635"] < threshold & x[,"SNR 532"] < threshold );
>>>
>>>#to include only genes and not control spots (I use Agilent microarrays)
>>>spotok <- (x[,"ControlType"] == "false");
>>>
>>>#to exclude spots with flag "bad" by GenePix Pro 6
>>>flagok <- (x[,"Flags"] >= 0);
>>>
>>>#to exclude saturated spots
>>>satok <- !(x[,"F635 % Sat."] > 10 | x[,"F532 % Sat."] > 10 );
>>>
>>>spot <- (snrok & spotok & flagok & satok);
>>>as.numeric (spot);
>>>}
>>>
>>>In my opinion it is right to exclude saturated spots (because their 
>>>intensity values are not reliable). Is this wrong?
>>>I have a doubt about excluding spots with low SNR, because in my last 
>>>experiment I would have to exclude about 60% of 45000 spots for low SNR, 
>>>and I am worried about the robustness of a statistical analysis based on 
>>>only 40% of the genes.
>>>Should I exclude these spots?
>>>Before or after normalization?
>>>Should I normalize all the spots and then, on the normalized values, apply 
>>>the SNR quality filter to exclude normalized spots with low SNR from 
>>>subsequent statistical analysis?
>>>I would like to use arrayWeights() from LIMMA and combine spot quality 
>>>weights and array quality weights. Is it right to multiply the spot 
>>>weight matrix by the array quality vector?
>>>
>>>Thank you very much for any help on this complicated question.
>>>
>>>Erika
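
The point made in the thread about multiplying spot weights by array weights only when both are matrices of the same dimension can be illustrated with a small sketch. Everything below is a toy example with made-up numbers: `spot_w`, `array_w` and the other object names are hypothetical, and base R is used for the expansion step so the code runs without limma installed. In a real analysis the vector would presumably come from arrayWeights() and the expansion would be done with asMatrixWeights(), as suggested above.

```r
# Toy spot-weight matrix: 4 spots (rows) x 3 arrays (columns).
# 1 = keep the spot, 0 = exclude it (the kind of output a wt.fun gives).
spot_w <- matrix(c(1, 1, 0,
                   1, 0, 1,
                   1, 1, 1,
                   0, 1, 1),
                 nrow = 4, byrow = TRUE)

# One quality weight per array (made-up values standing in for the
# vector that arrayWeights() would return).
array_w <- c(1.5, 1.0, 0.5)

# Expand the vector to a matrix in which every row repeats the
# per-array weights. Assumption: this mimics, in base R, the role that
# asMatrixWeights() plays in the advice given in the thread.
array_w_mat <- matrix(array_w,
                      nrow = nrow(spot_w), ncol = ncol(spot_w),
                      byrow = TRUE)

# Element-wise product: excluded spots stay at 0, all other spots
# take the weight of the array they sit on.
combined_w <- spot_w * array_w_mat

# Pitfall: multiplying by the bare vector recycles it down the columns,
# so array weights end up attached to the wrong arrays.
naive_w <- spot_w * array_w

print(combined_w)
```

A combined matrix of this shape is the kind of object that could then be supplied as the weights when fitting the linear model; whether that is appropriate for a given data set is exactly the question under discussion here.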