[BioC] DESeq2, problems using list argument to extract results from DESeq

Jon Bråte jon.brate at ibv.uio.no
Wed Jul 30 20:23:23 CEST 2014


  
Simon Anders <anders at ...> writes:

> 
> Hi Jon
> 
> On 11/07/14 10:36, Jon Bråte [guest] wrote:
> > I have gene expression from several embryogenesis stages as well as
non-reproductive stages. I have done
> DESeq on my DESeqDataSet and want to extract the results from all the
embryogenesis stages vs non-reproductive.
> >
> > Heres the error I get:
> >
> >> resNonReprVsAll = results(dds, contrast=list(c("Early vitellogenesis",
"Late vitellogenesis",
> "Fertilization", "Early cleavage", "Late cleavage", "Early preinversion",
"Late preinversion",
> "Early postinversion", "Late postinversion"), "Non reproductive"))
> > Error in cleanContrast(object, contrast, expanded = isExpanded,
listValues = listValues) :
> >    all elements of the contrast as a list of length 2 should be elements
of 'resultsNames(object)'
> 
> The 'results' function gives you a results table for _one_ contrast. If 
> you want several contrasts, you have to call 'results' several times.
> 
> Also note that 'results' now supports two ways of specifying contrasts: 
> For main effects, we suggest that you use the new format of passing a 
> vector with three strings, namely the factor which you want to contrast 
> and then the two levels you want to compare, e.g.,
> 
> results( dds, contrast = c( "condition",
>     "Fertilization", "Non reproductive" ) )
> # (or perhaps "Non.reproductive", depending how your level is
> # actually names)
> 
> The old way of specifying _two_ elements from 'resultsNames' needs to be 
> used only for more complicated contrasts, e.g., those involving 
> interactions.
> 
> If this is unclear, ask again, and explain the biology of your 
> experiment and what contrasts you are actually interested in.
> 
>    Simon
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at ...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> 


Dear Simon,

thank you for your answer. What I am interested in is to find genes that
possibly play a part in embryogenesis. In other words, genes that are
up-regulated in any of the embryogenesis stages compared to the
non-reproductive stages. 

I have tried three different approaches, but struggle to understand what
they mean, how they differ, and if they actually make sense. I try to
explain what I did, sorry if it is difficult to read:


First I tried to contrast all the embryogenesis stages compared to the non
reproductive using the Wald test in DESeq:

dds = DESeq(dds)
resCtrst = results(dds, contrast=list(c("conditionEarly.vitellogenesis",
"conditionLate.vitellogenesis", "conditionFertilization",
"conditionEarly.cleavage",
                                        "conditionLate.cleavage",
"conditionEarly.preinversion", "conditionLate.preinversion",
                                        "conditionEarly.postinversion",
"conditionLate.postinversion"), "conditionNon.reproductive"))
resCtrst = resCtrst[order(resCtrst$padj),]
resSigCtrst = resCtrst[which(resCtrst$padj < 0.1), ]
resSigCtrstPositive = resSigCtrst[which(resSigCtrst$log2FoldChange > 0), ]
write.csv2(resSigCtrstPositive,
file="DESeq2_Wald_test_Sig_upreg_all_cond_vs_non_repr.csv")

This gave me 1736 upregulated genes.


Second I ran an LTR test like this:

ddsLRT = DESeq(dds, test="LRT", full=~condition, reduced=~1)
resLRT = results(ddsLRT)
resLRT = resLRT[order(resLRT$padj),]
resSigLRT = resLRT[which(resLRT$padj < 0.1), ]
resSigLRTPositive = resSigLRT[which(resSigLRT$log2FoldChange > 0), ]
write.csv2(resSigLRT, file="DESeq2_Sig_LRT_upreg.csv")

This gave me 3158 upregulated genes (all genes have positive log2fold
changes so I guess I cannot see which are up or down regulated?)

And I don't understand exactly what is contained in resLRT, are the
significant genes here significantly expressed between Non reproductive
samples and all others compared to the null hypothesis? Or is it just for
the two conditions Late.postinversion vs Non reproductive?:

mcols(resLRT)
DataFrame with 6 rows and 2 columns
          type                                                       
description
   <character>                                                       
<character>
1 intermediate                                        the base mean over all
rows
2      results log2 fold change: condition Late.postinversion vs
Non.reproductive
3      results   standard error: condition Late.postinversion vs
Non.reproductive
4      results                              LRT statistic: '~ condition' vs
'~ 1'
5      results                                LRT p-value: '~ condition' vs
'~ 1'
6      results                                               BH adjusted
p-values


Third I tried to compare the Wald test and the LTR:

res = results(dds) #should I use resCtrst instead of res??
tab = table(Wald=res$padj < .1, LRT=resLRT$padj < .1) 
addmargins(tab)

       LRT
Wald    FALSE TRUE  Sum
  FALSE  1027 1766 2793
  TRUE    526  747 1273
  Sum    1553 2513 4066


sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
base     

other attached packages:
[1] DESeq2_1.4.5            RcppArmadillo_0.4.320.0 Rcpp_0.11.2            
GenomicRanges_1.16.3   
[5] GenomeInfoDb_1.0.2      IRanges_1.22.9          BiocGenerics_0.10.0    

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.26.0 Biobase_2.24.0       DBI_0.2-7           
RColorBrewer_1.0-5   RSQLite_0.11.4      
 [6] XML_3.98-1.1         XVector_0.4.0        annotate_1.42.0     
genefilter_1.46.1    geneplotter_1.42.0  
[11] grid_3.1.0           lattice_0.20-29      locfit_1.5-9.1      
splines_3.1.0        stats4_3.1.0        
[16] survival_2.37-7      tools_3.1.0          xtable_1.7-3



More information about the Bioconductor mailing list