[BioC] Detection of differential expression using limma

Christian Eisen christianeisen at alice-dsl.de
Mon Oct 13 20:43:44 CEST 2008

Yes, you are right, this is data from an Agilent miRNA microarray.
I have studied the Agilent protocol, and the TotalGeneSignal is a processed
signal: background-subtracted and normalized in some way. They didn't
specify which method they use, or I just didn't read it properly...

Well, I am not using the processedSignal. I only loaded it into the red
channel because read.maimages from limma doesn't work otherwise, so this is
a workaround. All subsequent steps, such as normalization, are performed
on the green channel only.
I know that there is still debate about the proper normalization method
for miRNA data.
There is a paper dealing with that particular subject:

Davison T.S., Johnson C.D. and Andruss B.F. "Analyzing micro-RNA
expression using microarrays" Methods in Enzymology 411:14-34, 2006.

However, they suggest VSN normalization.
Previous studies used median normalization, but I thought that median
normalization is only performed for "within-array normalization" and not
for "between-array normalization".
I may well be wrong; as I said, I am not really familiar with the topic
of microarray processing.
Nevertheless, I haven't found a method to process my data using median
normalization, as all the methods I have found use both channels, red and
green. But if anybody knows how to do median normalization on one-color
data, I am happy to learn.
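
For illustration, one common reading of "median normalization" for
one-color data is a between-array shift on the log2 scale, so that every
array ends up with the same median. The following is only a minimal sketch
of that idea in Python, not limma's implementation; the function name
median_normalize and the intensities are hypothetical, and it assumes all
intensities are strictly positive.

```python
import math
from statistics import median

def median_normalize(arrays):
    """Between-array median normalization for single-channel data.

    `arrays` maps an array name to its list of raw, strictly positive
    intensities. Each array is log2-transformed and then shifted so that
    its median equals the mean of all array medians.
    """
    logged = {name: [math.log2(v) for v in values]
              for name, values in arrays.items()}
    medians = {name: median(vals) for name, vals in logged.items()}
    target = sum(medians.values()) / len(medians)
    return {name: [v - medians[name] + target for v in vals]
            for name, vals in logged.items()}

# Hypothetical green-channel intensities for two arrays (made-up numbers)
raw = {
    "array1": [64.0, 128.0, 256.0, 512.0, 1024.0],
    "array2": [128.0, 256.0, 512.0, 1024.0, 2048.0],
}
normalized = median_normalize(raw)
```

Because only a constant is subtracted per array, differences between probes
within an array are left unchanged; only the overall level of each array
moves.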

As far as I can interpret the data, I think that VSN is not working that
well, as it computes ML estimates from the data available, and data points
lower than the estimate get a negative value. So for the replicate spots of
a gene, if one of them has a lower value than the rest of the replicates, it
ends up with a negative value and voilà... significance... even though the
data doesn't show it...
Wolfgang Huber suggested in another thread to do the VSN normalization
just for the spike-in controls and then use the estimates derived from that
on the rest of the data. But this seems to be a heavy task: I spent my
whole day studying the vignettes and other material found via Google
concerning this topic, but I just have no clue how to do it.
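
The general shape of "fit on the controls, apply to everything" can be
sketched as follows. This is a deliberately simplified stand-in and not
VSN's maximum-likelihood fit: it estimates a per-array offset and scale by
ordinary least squares on the control probes only, then applies those
parameters plus an arsinh transform to all probes. The function names, the
control indices and the intensities are all hypothetical.

```python
import math

def fit_affine(x, ref):
    """Least-squares fit of x ~ offset + scale * ref; returns (offset, scale)."""
    n = len(x)
    mean_x, mean_ref = sum(x) / n, sum(ref) / n
    sxy = sum((r - mean_ref) * (v - mean_x) for r, v in zip(ref, x))
    sxx = sum((r - mean_ref) ** 2 for r in ref)
    scale = sxy / sxx
    return mean_x - scale * mean_ref, scale

def calibrate_on_controls(arrays, control_idx):
    """Estimate offset/scale per array from control probes, apply to all probes."""
    names = list(arrays)
    # Reference: mean intensity across arrays for each control probe
    ref = [sum(arrays[n][i] for n in names) / len(names) for i in control_idx]
    out = {}
    for n in names:
        offset, scale = fit_affine([arrays[n][i] for i in control_idx], ref)
        # arsinh is log-like for large values but also defined at and below zero
        out[n] = [math.asinh((v - offset) / scale) for v in arrays[n]]
    return out

# Made-up data: array2 is exactly 2 * array1 + 50; probes 0-3 play the spike-ins
raw = {
    "array1": [100.0, 200.0, 400.0, 800.0, 150.0, 3000.0],
    "array2": [250.0, 450.0, 850.0, 1650.0, 350.0, 6050.0],
}
controls = [0, 1, 2, 3]
calibrated = calibrate_on_controls(raw, controls)
```

In this toy case the two arrays differ only by an offset and a scale, so
after calibration the non-control probes (indices 4 and 5) agree between the
arrays as well, even though they were not used in the fit.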

So if I can't figure that out, I will probably stick to the log2-transformed
data, as it at least shows reasonable hits in the results.
