[BioC] DESeq - Estimating Dispersion with Technical Replicates

Sat Sep 15 14:29:14 CEST 2012

Dear Andrés

Thank you for your report. From the error message that you sent, it 
seems that you are using an older version of DESeq. Can you update to 
the latest version (ideally the development version [1], but at least 
the latest release, version 1.8.3)?

You say "according to the user guide provided by Simon Anders, we must 
*sum up* their counts to get a single column ... I end up with two 
columns: each one representing a condition that has the *mean* counts 
from the two technical replicates...."
Please do use the sum, not the mean.
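
As a minimal sketch of what summing means here: assuming you have a matrix of raw integer read counts with one column per technical replicate, the columns can be collapsed in base R as below. The gene names, column names and numbers are made up for illustration and are not taken from your attachment.

```r
# Hypothetical raw read counts: rows are genes, columns are sequencing runs
# (two technical replicates per condition). All values are invented.
rawCounts <- matrix(
  c( 10L, 14L, 30L, 26L,
      0L,  1L,  5L,  7L,
    120L, 98L, 60L, 75L),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("untreated_tech1", "untreated_tech2",
                    "treated_tech1", "treated_tech2")))

# Which biological sample each column (technical replicate) belongs to.
bioSample <- factor(c("untreated", "untreated", "treated", "treated"),
                    levels = c("untreated", "treated"))

# rowsum() adds up rows by group, so transpose to group the columns,
# then transpose back to genes-by-samples.
countTable <- t(rowsum(t(rawCounts), group = bioSample))

print(countTable)
```

The result has one column per condition and still contains integer counts, which taking the mean would not guarantee.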

Then, please consider the vignette [2], which addresses your use case in 
Section 3.3 "Working without any replicates" and recommends this code:

  cds2 = estimateDispersions( cds2, method="blind",
                       sharingMode="fit-only" )
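
For orientation, here is a rough sketch of where that call sits in a complete run. It assumes that DESeq (1.x) is installed, that countTable is an integer matrix with one column per condition, and that conditions is a factor with two levels; the wrapper name runWithoutReplicates is only illustrative and is not part of the package.

```r
# Sketch only: wraps the usual DESeq 1.x steps for a two-sample design
# without biological replicates. 'runWithoutReplicates' is a made-up name.
runWithoutReplicates <- function(countTable, conditions) {
  # One condition label per column, exactly two conditions to compare.
  stopifnot(ncol(countTable) == length(conditions))
  stopifnot(is.factor(conditions), nlevels(conditions) == 2)

  library(DESeq)  # assumption: DESeq 1.x is installed

  cds <- newCountDataSet(countTable, conditions)
  cds <- estimateSizeFactors(cds)

  # Without replicates, treat all samples as if they were replicates of one
  # condition ("blind") and use only the fitted dispersion values.
  cds <- estimateDispersions(cds, method = "blind",
                             sharingMode = "fit-only")

  # Test the second condition against the first.
  nbinomTest(cds, levels(conditions)[1], levels(conditions)[2])
}
```

A call might look like res = runWithoutReplicates( countTable, factor( c("untreated", "treated"), levels=c("untreated", "treated") ) ). The dispersion fit needs a realistic number of genes, so it is not expected to work on a toy table with only a few rows.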

If your problem persists, please send the output of 'sessionInfo()' in 
your next report, so that we don't need to chase after problems that 
have already been fixed.

[1] http://www.bioconductor.org/packages/devel/bioc/html/DESeq.html
[2] http://www.bioconductor.org/packages/devel/bioc/vignettes/DESeq/inst/doc/DESeq.pdf

	Best wishes
	Wolfgang

Sep/15/12 6:14 AM, Andres Eduardo Rodriguez Cubillos wrote:
> Good day everyone,
>
> My name is Andrés. I'm from Universidad de los Andes located in
> Bogota D.C. (Colombia) and am currently using the DESeq package to
> analyze differential gene expression between two experimental
> conditions.
>
> I attach an example of the countData format I'm using to run the
> analysis in DESeq. Each column represents a treatment, or condition,
> that has the mean counts of two technical replicates; each row
> represents the FPKMs (count reads) obtained from CuffCompare after
> our RNA-seq data was processed through Bowtie and Cufflinks.
>
> In our experiment we used a technical replicate for each condition
> and, according to the user guide provided by Simon Anders, we must
> sum up their counts to get a single column corresponding to a unique
> biological replicate. In the end I have two columns, each one
> representing a condition that has the mean counts from the two
> technical replicates of that condition. It's important to say that we
> do not have any biological replicates, only technical replicates.
>
> Everything appears to be going fine until we try to estimate the
> dispersion of the normalized counts... an error message appears
> indicating that "X must be an array of at least two dimensions". I
> attach my results and the error message.
>
> I hope you can help us solve this issue. We're thinking this error
> might be related to the fact that we only have one column for each
> condition: one for "treated" and one for "untreated". In the
> countData from the guide I see there's more than one column for each
> condition: "treated2", "treated3" and "untreated3", "untreated4"
> (Guide: Analysing RNA-seq Data with the DESeq Package from
> 2012-03-16). However, if we only want to compare two conditions that
> only have technical replicates, we can only produce one column per
> condition because we must sum up both technical replicates into one
> column.
>
> We'll be glad to hear from you and appreciate any advice you can give
> us.
>
> Best regards,
>
>
> Andrés Rodríguez
>
> LAMFU Universidad de los Andes Bogotá D.C. Colombia

-- 
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber