[BioC] Increase in CV of replicated spots after normalization?

Wolfgang Huber huber at ebi.ac.uk
Mon Jun 20 20:36:47 CEST 2005


Hi Claus,

thanks for pointing this out. The confusion slipped through because the
standard deviation of log(x) is approximately equal to the CV of x when
the latter is not too large (this follows from a first-order expansion
of the logarithm). So when I talked about "CV of replicates", I meant
the standard deviation of their log-ratios.
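
To spell out the expansion: for x close to its mean m, log(x) is
approximately log(m) + (x - m)/m, so sd(log(x)) is approximately
sd(x)/m = CV(x). A minimal Python sketch to check this numerically,
assuming log-normally distributed intensities (the distribution, the
seed and the parameter values are assumptions chosen for illustration,
not taken from Jakob's data):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def cv(x):
    """Coefficient of variation: sample standard deviation divided by mean."""
    return np.std(x, ddof=1) / np.mean(x)


# Hypothetical positive intensities, log-normally distributed so that
# sd(log(x)) is close to sigma by construction.
results = {}
for sigma in (0.05, 0.1, 0.3, 1.0):
    x = rng.lognormal(mean=8.0, sigma=sigma, size=100_000)
    results[sigma] = (cv(x), np.std(np.log(x), ddof=1))
    print(f"sigma={sigma}: CV(x)={results[sigma][0]:.3f}, "
          f"sd(log x)={results[sigma][1]:.3f}")
```

For small sigma the two numbers should nearly coincide; for sigma = 1
the CV of x should be clearly larger than sd(log(x)), which is where
the approximation breaks down.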

However, in his mail Jakob referred to "CV of log-ratios", and you are 
absolutely right - these are not appropriate.
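
To illustrate why, here is a small Python sketch with made-up numbers
(the log-ratio values and the shift of 0.8 are hypothetical, and the
shift only stands in for the centering effect of a normalization). It
suggests that the CV of a replicate pair is unchanged by rescaling but
grows as the mean moves toward 0, while the standard deviation is
unaffected by the shift:

```python
import numpy as np


def cv(values):
    """CV as standard deviation divided by mean (means are positive here)."""
    values = np.asarray(values, dtype=float)
    return np.std(values, ddof=1) / np.mean(values)


# Hypothetical log-ratios of one pair of replicated spots.
raw = np.array([1.10, 0.90])

halved = raw / 2.0    # rescaling: both sd and mean are halved
centred = raw - 0.8   # hypothetical shift: the mean moves toward 0

cv_raw, cv_halved, cv_centred = cv(raw), cv(halved), cv(centred)
sd_raw = np.std(raw, ddof=1)
sd_centred = np.std(centred, ddof=1)

print("CV raw / halved / centred:", cv_raw, cv_halved, cv_centred)
print("SD raw / centred:", sd_raw, sd_centred)
```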

	Best wishes
	Wolfgang

Claus Mayer wrote:
> Hi Wolfgang and Jakob
> 
> I think there is some confusion here. The CV is (at least as far as I 
> know) standard deviation divided by mean, so it is scale-invariant, i.e. 
> dividing all log-ratios by 2 shouldn't make a difference. It is not 
> location-invariant though, which could be the explanation for the 
> increased CV. The normalisation centers the log-ratio distribution, so 
> for most genes the mean should be closer to 0 than before, which will 
> result in an increased CV.
> For that reason the CV is not an appropriate tool here to assess the 
> effect of the normalisation. As Wolfgang points out, the distribution 
> of F- or t-statistics (or the corresponding p-values) should be a 
> reasonable (and scale-invariant!) exploratory tool to assess the success 
> of the normalisation.
> 
> Best Wishes
> 
> Claus
> 
> 
> Wolfgang Huber wrote:
> 
>> Hi Jakob,
>>
>> it can be misleading to look solely at the CV of replicates to assess 
>> normalization: if you did that, a normalization method that 
>> simply divided all your log-ratios by 2 would be twice as good, and 
>> one that set everything to zero would be even better.
>>
>> What I usually do is look at the distribution of F- or t-statistics 
>> per gene across arrays for some meaningful biological grouping of the 
>> samples. There need to be enough replicate arrays within each group 
>> for this.
>>
>> Still, if you used a "reasonable" normalization method, it sounds like it 
>> didn't work well on your data. It is hard to say more without more 
>> details on what you did and diagnostic plots etc.
>>
>> Best regards
>>  Wolfgang
>>
>>
>>
>>
>>
>> Jakob Hedegaard wrote:
>>  
>>
>>> Hi list
>>>
>>>
>>>
>>> I am working on a data set from 24 arrays, where each array consists of
>>> 6,912 spots replicated pairwise at two different spatial locations.
>>>
>>> For quality evaluation, I have calculated the CV of "raw" log-ratios for
>>> each pairwise replicated spot (13,824 points per array) and have
>>> observed the expected tendency of decreasing CV with increasing average
>>> spot intensity.
>>>
>>> When calculating the CV for normalized data, I have observed that the CV
>>> has increased compared to CV for raw data. This essentially means that
>>> normalization is making data worse in terms of variance among replicated
>>> spots!
>>>
>>>
>>>
>>> Has anybody observed something similar?
>>>
>>> Is this what should be expected or does it indicate that the
>>> normalization is not optimally performed?
>>>
>>>
>>>
>>> Looking forward to hearing from you!
>>>
>>> Jakob
>>>
>>>   
> 
> 
> 


-- 
Best regards
   Wolfgang

-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax:   +44 1223 494486
Http:  www.ebi.ac.uk/huber