[BioC] One-Color Agilent miRNA microarrays

Wolfgang Huber huber at ebi.ac.uk
Tue Jul 31 20:22:21 CEST 2007


Hi Pedro & Francesco,


> Don't you think that (for the miRNA data) the quantile method might have the
> same problem as the vsn method, i.e., the assumption that the majority of
> the genes do not show differential expression among experimental conditions?
> I think that using vsn or quantile method for normalizing between miRNA
> chips could introduce strong artifacts in the data. 

I agree, and would like to add two points:

- The assumption that most genes are not differentially expressed is used in
vsn to derive the ML-estimator, but a violation of it does not automatically
imply that the results are invalid. It just means that you have no guarantee
and need to check / verify by other means. [And for those who use vsn2,
please check the parameter "lts.quantile": its default (0.9) is quite high,
and setting it lower sometimes helps when many genes change.]

- For quantile normalisation, a different assumption is made: that the
overall distribution of expression does not change. Depending on the
biology, I have seen cases where this is more appropriate, and cases where
it is less appropriate, than the assumption that most genes are not
differentially expressed.
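
To make the quantile assumption concrete, here is a minimal base R sketch
(not limma's implementation; the function name and the simulated data are
hypothetical). It shows that quantile normalisation forces every array to
have the same distribution, so a genuine global shift between arrays is
removed along with any technical one.

```r
# Minimal quantile normalisation in base R. This is an illustrative sketch,
# not limma's code; ties are broken by order of appearance here.
quantile_normalise <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))   # mean of the k-th smallest values
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

set.seed(1)
# Hypothetical log2 data: array B carries a genuine global shift of +2.
x  <- cbind(A = rnorm(490, mean = 8), B = rnorm(490, mean = 10))
qn <- quantile_normalise(x)

colMeans(x)    # the two arrays differ by about 2
colMeans(qn)   # equal after normalisation: the shift is gone
```

If the shift in array B were biological, the normalised data would no longer
show it, which is exactly the risk discussed above.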

Ideally, you can use spike-in controls and fit the vsn-model only on
these, then apply it to all data (see e.g. vsn vignette). This would
address your problem that so many microRNAs are expected to change.
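
A sketch of the two vsn options mentioned above, assuming the vsn package
is installed. The intensity matrix is simulated and the choice of which rows
count as spike-ins is hypothetical; with real data you would select the
spike-in probes from the array annotation.

```r
library(vsn)

set.seed(1)
# Hypothetical raw intensities: 690 probes x 5 arrays. The first 200 rows
# stand in for spike-in controls (an assumption made for this sketch).
n_probes <- 690
x <- matrix(rexp(n_probes * 5, rate = 1 / 500) + 50,
            nrow = n_probes, ncol = 5)
is_spike <- seq_len(n_probes) <= 200

# (a) Fit on all probes with a lower lts.quantile (default 0.9, allowed
#     range 0.5 to 1), so that the fit is driven by the half of the probes
#     with the smallest residuals.
fit_robust <- vsn2(x, lts.quantile = 0.5)
nx_robust  <- predict(fit_robust, newdata = x)

# (b) Fit on the spike-ins only, without trimming, then apply the fitted
#     transformation to every probe.
fit_spike <- vsn2(x[is_spike, ], lts.quantile = 1)
nx_spike  <- predict(fit_spike, newdata = x)

dim(nx_spike)  # same dimensions as x
```

Option (b) only makes sense if the spike-ins cover the intensity range of
the data and there are enough of them for a stable fit.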

And sometimes, in really variable cases, and if the experiments were
done carefully, doing no normalisation at all, except for (g)log
transformation, may be the least bad.
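
As a rough illustration of a transformation-only approach, here is a base R
sketch of a generalised log on the log2 scale. The function name and the
"scale" constant are hypothetical; vsn itself uses an arsinh-based
transformation with parameters fitted per array, which this sketch does not
reproduce.

```r
# Generalised log on the log2 scale: close to log2(x / scale) for large x,
# but finite and smooth for zero or negative background-corrected values.
# `scale` is a hypothetical tuning constant, not a vsn parameter.
glog2 <- function(x, scale = 1) {
  z <- x / scale
  log2((z + sqrt(z^2 + 1)) / 2)
}

intensities <- c(-50, 0, 10, 1000, 1e6)
round(glog2(intensities), 3)  # finite everywhere, increasing in x
log2(1e6)                     # compare with the last value above
```

Unlike log2, this does not produce NaN for the negative intensities that
method="subtract" background correction can generate.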

 best wishes
 Wolfgang


------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber



> Hi Pedro,
> 
> Yes you're right.
> 
> I used vsn just because it transforms directly to the log2 scale. But I
> should use method="quantile" and then take
> MA<-log2(MA)
> 
> or better:
> 
> MA <- normalizeBetweenArrays(log2( Gbg.sort$G), method="quantile")
> 
> I've just tried it and the results are slightly different; apparently they
> are better. And in fact, in the case of miRNA it is good not to make
> assumptions about the expression of the majority of the genes. We have
> 490 x 20 miRNA spots plus a few hundred controls, which is really different
> from the situation of a microarray for gene expression.
> 
> Thanks for the correction.
> 
> Francesco
> 
>  
> 
> Hi Francesco,
> 
> This is not an answer to what you've asked but an additional question. Do
> you think that it's correct to apply vsn normalization to a miRNA chip?
> Vsn assumes that the majority of the genes are not differentially expressed,
> and this might not be the case for a microRNA chip.
> 
> Thanks
> 
> Pedro.
> 
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On behalf of Francesco
> Favero
> Sent: Monday, 30 July 2007 12:39
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] One-Color Agilent miRNA microarrays
> 
> Dear all,
> 
> I'm working with the new one-color microRNA microarrays from Agilent.
> I decided to use limma, and in this particular case I need to analyse a
> time-course experiment.
> 
> Thanks to this list I've managed almost everything, but I'm not sure it's
> perfectly right...
> 
> I've imported the chips thanks to Peter White and Dr. Gordon Smyth:
> 
> https://stat.ethz.ch/pipermail/bioconductor/2007-May/017203.html
> 
> So I have an object G with green and dummy red intensities.
> 
> The normalisation:
> 
> Gbg<-backgroundCorrect(G, method="subtract")
> 
> This array has 20 spots for each microRNA, but they are not in any
> particular order on the chip, so I sorted them by GeneName in order to be
> able to use the ndups argument:
> 
> spottypes <- readSpotTypes()
> G$genes$Status <- controlStatus(spottypes, G)
> Gbg.subset <- Gbg[Gbg$genes$Status == "Gene",]
> Gbg.sort <- Gbg.subset[order(Gbg.subset$genes[,"GeneName"]),]
> 
> Then normalisation between arrays, using just the green channel:
> 
> MA <- normalizeBetweenArrays(Gbg.sort$G, method="vsn")
> 
> Then I set up a standard time-course analysis as in the limma manual:
> 
> lev <- c("15", "37", "97", "167", "618")
> f <- factor(targets$Cy3, levels=lev)
> design <- model.matrix(~0+f)
> colnames(design) <- lev
> dupcor <- duplicateCorrelation(MA, design, ndups=20, spacing=1)
> 
> Now I have to fit the linear model... but lmFit doesn't work.
> It fails with an error in chol(V).
> 
> fit <- lmFit(MA,design,ndups=20,spacing=1,correlation=dupcor$consensus)
> Error in chol(V) : the leading minor of order 2 is not positive definite
> 
> (The original error message was in Italian.)
> It works if I don't use ndups, but I need it...
> 
> I had a look at the lmFit function and it turned out that chol(V) is
> called inside the gls.series function. I get the same error if I run
> gls.series directly to build my "fit" object.
> 
> The only way I managed to make it work was this:
> 
> fit <- lm.series(MA, design = design, ndups = 20, spacing = 1,
>                  weights = Gbg.sort$weights)
> fit$genes <- uniquegenelist(Gbg.sort$genes, ndups = 20, spacing = 1)
> fit$design <- design
> fit$Ameans <- rowMeans(unwrapdups(MA, ndups = 20, spacing = 1), na.rm = TRUE)
> fit <- new("MArrayLM", fit)
> 
> So I've used lm.series instead of gls.series. I don't know whether this is
> a mistake or not.
> I tried to compare the two functions; they are similar, but I'm not sure
> they do exactly the same thing in my case.
> 
> Anyway, contrasts.fit for the time course and eBayes both work, and I
> obtain results that I'm going to validate with other tools.
> 
> Corrections and suggestions are really welcome. 
> 
> Thanks for developing an environment like Bioconductor and for all the
> support.
> 
> Francesco
> --
> 
> Cancer Genomics Lab.
> "Edo e Elvo Tempia" Foundation.
> Via Malta 3 13900 Biella
> Tel +39 015351830 
> http://www.fondoedotempia.it/sub_lab.php
> 
>        
>


