[BioC] HT qPCR - error in scale rank invariant

Tue Jan 25 12:51:13 CET 2011

Dear Heidi,

Thanks for your reply. Indeed, I am comparing cell types that have huge
differences between their miRNA profiles, and unfortunately the qPCR assay
has only one endogenous gene, which is affected by cell type, so the dCt
method is not adequate. I have tried quantile normalization. The reason I
wanted to find another method is that, as you can see in the distribution
of Ct values, the S1 cells have many miRNAs which are not expressed and
which I am analyzing as Ct=40. So these cells are very different, and with
quantile normalization some differences will not show up in the analyses
because I am forcing them to have a distribution similar to the other
cells. Still, I think this approach is conservative, given that some
differences do appear, as you can see in the files after quantile
normalization. Implementing other methods that could deal with these
problems, i.e. working with cell types that behave very differently, as in
my case, and lacking endogenous genes for normalization, could be a
suggestion for your package.
Kind regards,
Andreia

PS: Attached are two files with the correlations and data distribution.
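
A minimal base R sketch of the effect described above, i.e. quantile
normalization forcing a sample with many Ct=40 values onto the same
distribution as the other samples. The Ct values and the helper
quantile_normalise() are invented for illustration; this is not HTqPCR
code, and the package's own quantile option may differ in detail.

```r
# Toy Ct matrix: rows = miRNAs, columns = samples (all values invented).
# Sample S1 has many undetected miRNAs recorded as Ct = 40.
ct <- cbind(
  A  = c(22, 25, 28, 31, 34, 36),
  B  = c(23, 26, 27, 32, 33, 37),
  S1 = c(24, 40, 40, 30, 40, 40)
)

# Hypothetical helper: replace each value by the mean of the values
# that hold the same rank in every sample.
quantile_normalise <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")  # rank within each sample
  ref   <- rowMeans(apply(x, 2, sort))               # mean of sorted columns
  out   <- apply(ranks, 2, function(r) ref[r])       # map ranks back to ref
  dimnames(out) <- dimnames(x)
  out
}

ct_qn <- quantile_normalise(ct)
print(round(ct_qn, 2))
```

After this step every column contains the same set of values, so the
four Ct=40 entries of S1 no longer sit at 40 but are spread over the
upper reference quantiles.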

On Mon, Jan 24, 2011 at 11:11 PM, Heidi Dvinge <heidi at ebi.ac.uk> wrote:

> Hello Andreia,
>
> I can reproduce the error you get if I say:
>
> > data(qPCRraw)
> > temp <- normalizeCtData(qPCRraw, norm="scale.rankinvariant")
> Scaling Ct values
>        Using rank invariant genes: Gene1 Gene29
>        Scaling factors: 1.00 1.06 1.00 1.03 1.00 1.00
> # Select just the first genes so that Gene29 is excluded
> > normalizeCtData(qPCRraw[1:10,], norm="scale.rankinvariant")
> Error in smooth.spline(ref[i.set], data[i.set]) :
>  need at least four unique 'x' values
>
> After looking into the code, the problem occurs when there is only a
> single (or no) rank-invariant gene between any individual sample and the
> reference sample (the mean or median across all samples). At least two
> rank-invariant genes are required between the reference and each sample.
> I'll make a note of this in the help file.
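
The smooth.spline() message quoted above can be reproduced in base R
without HTqPCR. This is only a sketch: the Ct values are invented, and
i.set here is a hypothetical index standing in for the rank-invariant
genes found between one sample and the reference.

```r
# Invented reference and sample Ct values, for illustration only.
ref    <- c(20, 24, 27, 30, 33, 36)
sample <- c(21, 25, 27, 31, 34, 35)

# Hypothetical case: only two rank-invariant genes were found.
i.set <- c(1, 4)

# smooth.spline() stops when it gets fewer than four unique x values;
# capture the error message instead of aborting.
res <- tryCatch(
  smooth.spline(ref[i.set], sample[i.set]),
  error = function(e) conditionMessage(e)
)
print(res)

# With enough distinct points the same call succeeds.
fit <- smooth.spline(ref, sample)
```

Wrapping the call in tryCatch() is just a way to show the message; it
does not make the normalisation usable when too few genes are available.
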
>
> This means that a rank-invariant method is not going to be robust enough
> for your normalisation. Instead, you'll have to go with ddCt or quantile.
> In the future there might be other options available in HTqPCR (e.g. scale
> by arithmetic or geometric mean) depending on demand.
>
> The likely cause of this is that your samples are quite different. Have
> you tried investigating them with e.g. plotCtCor or clusterCt to see if
> they group as expected, or if there's any marked difference in the
> distribution of Ct values (plotCtDensity)? Even a relatively harsh method
> such as quantile normalisation might be suitable for your data.
>
> Cheers
> \Heidi
>
>
> > Dear Heidi,
> >
> > thanks for the quick reply,
> >
> > after traceback() I get
> >
> > traceback()
> > 5: stop("need at least four unique 'x' values")
> > 4: smooth.spline(ref[i.set], data[i.set])
> > 3: FUN(newX[, i], ...)
> > 2: apply(data, 2, normalize.invariantset, ref = ref.data)
> > 1: normalizeCtData(raw.cat, norm = "scale.rank")
> >
> > information about the session
> > sessionInfo()
> > R version 2.11.1 (2010-05-31)
> > i386-apple-darwin9.8.0
> >
> > locale:
> > [1] C
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] statmod_1.4.8      HTqPCR_1.2.0       limma_3.4.4
> > RColorBrewer_1.0-2 Biobase_2.8.0
> >
> > loaded via a namespace (and not attached):
> > [1] affy_1.26.1           affyio_1.16.0         gdata_2.7.2
> > gplots_2.8.0          gtools_2.6.2          preprocessCore_1.10.0
> >
> > On Fri, Jan 21, 2011 at 5:41 PM, Heidi Dvinge <heidi at ebi.ac.uk> wrote:
> >
> >> Dear Andreia,
> >>
> >> > Dear all,
> >> >
> >> > I am analysing qPCR data from Exiqon cards, with one card per sample
> >> > and one observation per miRNA on each card. I have 8 cards in total:
> >> > 2 for treatment 1, 3 for treatment 2 and 3 for treatment 3. Each card
> >> > has one endogenous gene, which I would rather not use to normalize Ct
> >> > values because it is being affected by the type of treatment. So I
> >> > would like to use scale.rank.
> >> > I am getting the following error:
> >> >
> >> > sr.norm <- normalizeCtData(raw.cat, norm = "scale.rank")
> >> > Error in smooth.spline(ref[i.set], data[i.set]) :
> >> >   need at least four unique 'x' values
> >> >
> >> It sounds like there aren't enough rank-invariant genes across your 8
> >> cards. If that's the case, then this is admittedly not the most useful
> >> error message, and it should be changed. What does it say when you run
> >> traceback() following the error?
> >>
> >> The parameter "scale.rank.samples" in normalizeCtData() will let you set
> >> how many of the samples each gene has to be rank-invariant across in
> >> order to be included. By default this is the number of samples-1. You
> >> can try lowering that number, keeping in mind that the lower it is, the
> >> less robust your resulting rank-invariant genes are. If your samples are
> >> all highly variable across all genes, it might not be possible for you
> >> to use this normalisation method.
> >>
> >> If this does not seem to be the problem, something else might be going
> >> on with the function. In that case, please report back here and I can
> >> perhaps have a look at your data.
> >>
> >> I have been considering adding an additional parameter to
> >> normalizeCtData, so that genes just have to be rank-invariant within a
> >> certain interval, e.g. be located within -/+5 of each other on the
> >> ranked list. For rather low-throughput qPCR cards that could mess things
> >> up though.
> >>
> >> HTH
> >> \Heidi
> >>
> >> > Does this mean I don't have enough replicates?
> >> >
> >> > thanks for the help
> >> >
> >> > Andreia
> >> >
> >> > --
> >> > --------------------------------------------
> >> > Andreia J. Amaral
> >> > Unidade de Imunologia Clínica
> >> > Instituto de Medicina Molecular
> >> > Universidade de Lisboa
> >> > email: andreiaamaral at fm.ul.pt
> >> >           andreia.fonseca at gmail.com
> >> >
> >>
> >>
> >>
> >
>
>
>

-- 
--------------------------------------------
Andreia J. Amaral
Unidade de Imunologia Clínica
Instituto de Medicina Molecular
Universidade de Lisboa
email: andreiaamaral at fm.ul.pt
          andreia.fonseca at gmail.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: correlation_between_qnorm_data_cell_code.pdf
Type: application/pdf
Size: 414972 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20110125/7545a2fc/attachment-0002.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: correlation_between_raw_data_cellcode.pdf
Type: application/pdf
Size: 415066 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20110125/7545a2fc/attachment-0003.pdf>