[BioC] Problem elaborating contrasts for DESeq2 analysis
Marine Rohmer
Marine.Rohmer at mgx.cnrs.fr
Tue Jul 22 17:18:20 CEST 2014
Hi Michael,
First of all thank you a lot for your answer!
So you think that the "_vs_" names are only due to the older version ?
On my workplace I can't update these packages but I will try to install
them on my own computer in the latest version, and see if it goes
better.
Waiting for that, I've tried as you suggested :
dds <-
DESeqDataSetFromMatrix(countData=d2$counts,colData=descexp,design=~condition+0)
dds <- DESeq(dds, betaPrior=FALSE)
This second line turns but never ends (At least within 1 hour).
Thank you for your attention,
Best regards,
Marine
Le 2014-07-21 16:32, Michael Love a écrit :
> hi Marine,
>
> Note that you are using an older version of DESeq2 and Bioconductor
> (DESeq2 v1.4 was released in April 2014). So if you are searching by
> Google for documentation, it might not apply to your software (we
> print the version number on the first page of the vignettes). You can
> always get the correct documentation by using:
> browseVignettes("DESeq2") and ?results for the man page for the
> results() function.
>
> If you can update to the release version of Bioconductor (by updating
> to R 3.1.0), you can obtain the contrast you are interested in with:
>
> results(dds,
> contrast=list(c("conditionA1","conditionA2","conditionA3"),
> c("conditionB1","conditionB2","conditionB3")), listValues=c(1/3,
> -1/3))
>
> If you aren't able to update to R 3.1.0 and the release version of
> Bioconductor, you can hack your contrast of interest for v1.2 like so:
>
> design(dds) <- ~ condition + 0
> dds <- DESeq(dds, betaPrior=FALSE)
> # note that the resultsNames(dds) will not be labelled correctly
> # with design having + 0 (they still have the _vs_ in the names)
> # but the following provides the correct contrast
> attr(dds, "modelMatrix")
> results(dds, contrast = c(1,1,1,-1,-1,-1,0,0,0) / 3)
>
> Mike
>
> On Mon, Jul 21, 2014 at 4:46 AM, Marine Rohmer
> <Marine.Rohmer at mgx.cnrs.fr> wrote:
>> Dear bioconductor mailing list,
>>
>> I want to do a differential analysis using the GLM with DESeq2.
>> We have 18 RNA samples, including 6 samples subjected to treatment A,
>> 6
>> others subjected to treatment B, and the last 6 subjected to treatment
>> C.
>> Each treatment group is divided into 3 colonies, in which one we have
>> 2
>> replicates (leading us to 6 samples per treatment).
>> Colonies A1, A2 and A3 have been treated with the A treatment,
>> colonies B*
>> with the B treatment, and colonies C* with the C treatment.
>> (A picture of the experiment design is attached to this email).
>>
>> These treatments are what we want to study, being aware of the "colony
>> effect".
>>
>> It is important to note that both factors "colony" and "treatment" are
>> in
>> linear correlation, so that the classic contrast "A - B" does not work
>> (-->
>> "design matrix not of full rank").
>> That's why I want to study, for example, the contrast : (A1+A2+A3 -
>> B1-B2-B3)/3 .
>>
>> I've followed the DESeq2 documentation, and it seems I can do that
>> with the
>> numeric contrast vector as described in the 3.2. section.
>> My problem is that when typing "resultsNames(ddsCtrst)", I don't have
>> "A1",
>> "A2", "A3" etc... as described in the doc, but directly some
>> "A2_vs_A1",
>> "A2_vs_A1" etc... so that I can't do my numeric contrast vector as I
>> wanted.
>> I don't know why I have these "_vs_something" because I did not write
>> them
>> myself.
>>
>> This is my code :
>>
>> ### Preparing the analysis
>> library(DESeq2)
>> # Using edgeR to have the readDGE function
>> library(edgeR)
>>
>> # fileT is my target.txt tabulated file, containing :
>> # files group description colony
>> # path/to/countsFile.txt A A1_1 A1
>> # path/to/countsFile.txt A A1_2 A1
>> # path/to/countsFile.txt A A2_1 A2
>> # etc...
>>
>> targets <- read.delim(file = fileT, stringsAsFactors = FALSE,header =
>> TRUE,
>> comment.char="" )
>> d2 <- readDGE(targets, columns=c(1,2), header = FALSE)
>> d.RLE <- d2
>> d.RLE <- calcNormFactors(d.RLE,method="RLE")
>>
>> ### Preparing the design and contrast
>> descexp=c(d2$samples$colony)
>> # "condition" is my colony factor "A1", "A2", "A3", "B1", "B2" etc...
>> descexp=data.frame(condition=descexp)
>>
>>> descexp
>>
>> condition
>> 1 A1
>> 2 A1
>> 3 A2
>> 4 A2
>> 5 A3
>> 6 A3
>> 7 B1
>> 8 B1
>> 9 B2
>> 10 B2
>> 11 B3
>> 12 B3
>> 13 C1
>> 14 C1
>> 15 C2
>> 16 C2
>> 17 C3
>> 18 C3
>>
>> dds <- DESeqDataSetFromMatrix(countData =
>> d2$counts,colData=descexp,design=~condition)
>>
>> ### Checking the condition : each colony in two replicates
>>>
>>> dds$condition
>>
>> [1] A1 A1 A2 A2 A3 A3 B1 B1 B2 B2 B3 B3 C1 C1 C2 C2 C3 C3
>> Levels: A1 A2 A3 B1 B2 B3 C1 C2 C3
>>
>> ### Analysis
>> dds <- DESeq(dds)
>>
>> ### Now when typing resultsNames(dds) :
>>>
>>> resultsNames(dds)
>>
>> [1] "Intercept" "condition_A2_vs_A1" "condition_A3_vs_A1"
>> [4] "condition_B1_vs_A1" "condition_B2_vs_A1" "condition_B3_vs_A1"
>> [7] "condition_C1_vs_A1" "condition_C2_vs_A1" "condition_C3_vs_A1"
>>
>> ### Whereas I want something like described in the DESeq2 doc, that is
>> to
>> say without the "_vs_something"...
>>
>> Does anyone know what is wrong is my code (or my experiment design) ?
>> I hope my message is understable. Don't hesitate to ask me other
>> details.
>>
>> Thank you for advance for your attention,
>>
>> Best regards,
>>
>> M. R.
>>
>> NB :
>>
>>> sessionInfo()
>>
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-redhat-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
>> [5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
>> [7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel stats graphics grDevices utils datasets
>> methods
>> [8] base
>>
>> other attached packages:
>> [1] edgeR_3.4.0 limma_3.18.2 DESeq2_1.2.5
>> [4] RcppArmadillo_0.3.920.1 Rcpp_0.10.6
>> GenomicRanges_1.14.3
>> [7] XVector_0.2.0 IRanges_1.20.5 BiocGenerics_0.8.0
>>
>> loaded via a namespace (and not attached):
>> [1] annotate_1.40.0 AnnotationDbi_1.24.0 Biobase_2.22.0
>> [4] DBI_0.2-7 genefilter_1.44.0 grid_3.0.2
>> [7] lattice_0.20-24 locfit_1.5-9.1 RColorBrewer_1.0-5
>> [10] RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2
>> [13] survival_2.37-4 tools_3.0.2 XML_3.98-1.1
>> [16] xtable_1.7-1
More information about the Bioconductor
mailing list