[BioC] DESeq2: nested multi-factor design, singular matrix error

Michael Love michaelisaiahlove at gmail.com
Thu Jul 31 21:56:01 CEST 2014


hi BJ,

On Thu, Jul 31, 2014 at 2:54 PM, BJ Chen <bj.j.chen at gmail.com> wrote:
>
> Hi Mike,
>
> Thanks a lot for helping me resolving the test. I have some additional
> questions now after I use the design: ~sample+treatment+sample:treatment.
>
> I can get the contrasts out for the effect of general treatment and that of
> specific cell line. But, I also notice that there are genes with zero read
> count having large log fold change. I searched on the forum and saw posts
> with the same issue in DESeq2 1.2. However, I am using DESeq2 1.4.
>
> I also tried to turn off the betaPrior (I have more than two levels in the
> sample variable), but then the standard matrix type is used. I am not
> entirely clear then how to extract all the contrasts I want. For example,
> now resultsNames() gives me these
>
> Intercept, treatment_treated_vs_untreated, sample_B_vs_A, sample_C_vs_A,
> sample_D_vs_A, treatmenttreated.sampleB, treatmenttreated.sampleC,
> treatmenttreated.sampleD
>
>
> So, I assume the general effect of treatment is from
> treament_treated_vs_untreated. And I assume sampleB's treatment effect can
> be obtained from treatment_treated_vs_untreated + treatmenttreated.sampleB.
> Is this correct? How about sampleA's effect?
>

In the model without the beta prior, treament_treated_vs_untreated is
sample A's effect, because A is the base level. Here, you can get the
general effect of treated vs untreated by taking the effect for the
base level and adding 1/{# levels} of the interaction effects. so:

results(dds, contrast=c(0,1,  0,0,0, 1/4,1/4,1/4))

where this numeric vector corresponds to the names in resultsNames(dds)

Mike

> Thanks!
> BJ
>
>
>
>
>
>
>
>
>
>
> On Tue, Jul 29, 2014 at 12:14 AM, Michael Love <michaelisaiahlove at gmail.com>
> wrote:
>>
>>
>> On Jul 28, 2014 10:50 PM, "BJ Chen" <bj.j.chen at gmail.com> wrote:
>> >
>> >
>> > Thanks, Mike.
>> >
>> > What if I discard the replicate variable (hence losing the "paired"
>> > information for treatment test, but instead using design: ~sample +
>> > treatment + sample:treatment? Will I get both general treatment effect as
>> > well as sample-specific effect?
>> >
>>
>> Yes. This would also work. Check the examples in ?results for how to pull
>> out various contrasts.
>>
>> Mike
>>
>> > Thanks,
>> > BJ
>> >
>> > > On Jul 28, 2014, at 10:22 PM, Michael Love
>> > > <michaelisaiahlove at gmail.com> wrote:
>> > >
>> > > hi BJ,
>> > >
>> > > After looking again at your design, I don't think you have enough
>> > > residual degrees of freedom to estimate treatment effect for each
>> > > sample in addition to the replicate effect. But I can show how to
>> > > estimate the general treatment effect and control for replicate
>> > > effects (and therefore also sample). First I'd recommend you change
>> > > the sample table to reflect that replicates are only within sample,
>> > > e.g.:
>> > >
>> > > sample   replicate  treatment
>> > > A        1          no
>> > > A        2          no
>> > > A        1          yes
>> > > A        2          yes
>> > > B        3          no
>> > > B        4          no
>> > > B        3          yes
>> > > B        4          yes
>> > > ...
>> > >
>> > > Then you can use the design: ~ replicate + treatment. As sample and
>> > > replicate variables are linearly dependent, this controls for the
>> > > sample effect as well.
>> > >
>> > > You can also look into the Multi-level Experiments section of the
>> > > limma package, which describes fitting of the random effect of
>> > > replicate within a model with interactions for each sample and
>> > > treatment.
>> > >
>> > > Mike
>> > >
>> > >> On Mon, Jul 28, 2014 at 4:43 PM, BJ Chen <bj.j.chen at gmail.com> wrote:
>> > >>
>> > >> I am interested in both: the general effect of the treatment as well
>> > >> as the
>> > >> effect of treatment on each sample.
>> > >>
>> > >> Thanks!
>> > >>
>> > >>
>> > >>
>> > >> On Mon, Jul 28, 2014 at 4:41 PM, Michael Love
>> > >> <michaelisaiahlove at gmail.com>
>> > >> wrote:
>> > >>>
>> > >>> Ah that makes sense. Another question, are you interested in the
>> > >>> treatment effect for each different sample? Or are you interested in
>> > >>> the general effect of treatment, controlling for replicate and
>> > >>> sample.
>> > >>>
>> > >>>
>> > >>>> On Mon, Jul 28, 2014 at 4:18 PM, BJ Chen <bj.j.chen at gmail.com>
>> > >>>> wrote:
>> > >>>>
>> > >>>>
>> > >>>> Hi Mike,
>> > >>>>
>> > >>>> Sorry I was not being clear on the experiment design.  Replicate 1
>> > >>>> in
>> > >>>> sample A are different from replicate 1 in sample B. The replicate
>> > >>>> number
>> > >>>> just means there are two replicates for each sample. So the
>> > >>>> experiment was
>> > >>>> done as the following
>> > >>>>
>> > >>>> Sample A -> take one replicate (label A1) --> no treatment
>> > >>>>                                                            \-->
>> > >>>> with
>> > >>>> treatment
>> > >>>> Sample A -> take the 2nd replicate (label A2) --> no treatment
>> > >>>>
>> > >>>> \-->
>> > >>>> with treatment
>> > >>>> Sample B -> take one replicate (label B1) --> no treatment
>> > >>>>                                                            \-->
>> > >>>> with
>> > >>>> treatment
>> > >>>>
>> > >>>> The samples are "paired" in terms that each replicate of a sample
>> > >>>> is
>> > >>>> going through either no treatment or with treatment.
>> > >>>>
>> > >>>> Hopefully this explains it better. If not, please let me know.
>> > >>>>
>> > >>>> Thanks for your help,
>> > >>>> BJ
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> On Mon, Jul 28, 2014 at 4:07 PM, Michael Love
>> > >>>> <michaelisaiahlove at gmail.com> wrote:
>> > >>>>>
>> > >>>>> hi BJ,
>> > >>>>>
>> > >>>>> ignoreRank is only for advanced use, so you don't want to use
>> > >>>>> that.
>> > >>>>>
>> > >>>>> I don't fully understand what replicate 1 and 2 are, can you
>> > >>>>> explain?
>> > >>>>> You say replicate 1 and 2 are paired samples for treatment, but
>> > >>>>> they are
>> > >>>>> both also sample A? How is replicate 1 sample A related to
>> > >>>>> replicate 1
>> > >>>>> sample B?
>> > >>>>>
>> > >>>>> Mike
>> > >>>>>
>> > >>>>>
>> > >>>>>> On Mon, Jul 28, 2014 at 3:50 PM, BJ Chen <bj.j.chen at gmail.com>
>> > >>>>>> wrote:
>> > >>>>>>
>> > >>>>>> Hi,
>> > >>>>>>
>> > >>>>>> I am trying to run DESeq2 analysis on sample design like this:
>> > >>>>>>
>> > >>>>>> sample   replicate  treatment
>> > >>>>>> A        1          no
>> > >>>>>> A        2          no
>> > >>>>>> A        1          yes
>> > >>>>>> A        2          yes
>> > >>>>>> B        1          no
>> > >>>>>> B        2          no
>> > >>>>>> B        1          yes
>> > >>>>>> B        2          yes
>> > >>>>>> (repeated for 4 different samples. eg. A-D).
>> > >>>>>>
>> > >>>>>> The interests of DE effect include treatment on each sample and
>> > >>>>>> treatment
>> > >>>>>> over all.
>> > >>>>>>
>> > >>>>>> I have searched online and found previously suggested model as
>> > >>>>>> ~ treatment + sample:replicate + sample:treatment.
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> However, when I called DESeqDataSetFromMatrix(readcount,
>> > >>>>>> sampleinfo,
>> > >>>>>> ~treatment+sample:replicate+sample:treatment), it first
>> > >>>>>> complained the
>> > >>>>>> matrix is not full rank.
>> > >>>>>>
>> > >>>>>> I tried ignoreRank option, but then I got error when I called
>> > >>>>>> DESeq()
>> > >>>>>> (default parameters):
>> > >>>>>>
>> > >>>>>> error: inv(): matrix appears to be singular
>> > >>>>>>                                              Error in eval(expr,
>> > >>>>>> envir,
>> > >>>>>> enclos) : inv(): matrix appears to be singular
>> > >>>>>>        In addition: Warning message:
>> > >>>>>>                                                      In
>> > >>>>>> fitNbinomGLMs(objectNZ[fitidx, , drop = FALSE], alpha_hat =
>> > >>>>>> alpha_hat[fitidx],  :                25rows had non-positive
>> > >>>>>> estimates
>> > >>>>>> of
>> > >>>>>> variance for coefficients, likely due to rank deficient model
>> > >>>>>> matrices
>> > >>>>>> without betaPrior
>> > >>>>>>
>> > >>>>>> If I exclude the replicate ( eg.
>> > >>>>>> design=~treatment+sample+sample:treatment), it run through
>> > >>>>>> without
>> > >>>>>> errors.
>> > >>>>>> However, I would like to take into account the replicates, as
>> > >>>>>> they are
>> > >>>>>> paired samples for the treatment.
>> > >>>>>>
>> > >>>>>> I will appreciate any help/suggestions.
>> > >>>>>>
>> > >>>>>> Thanks,
>> > >>>>>> BJ
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> Session info is included in the bottom.
>> > >>>>>>
>> > >>>>>> R
>> > >>>>>> versio(2014-04-10)4-04-10)
>> > >>>>>>
>> > >>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>                        attached base packages:
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> [1] parallel  stats     graphics  grDevices utils     datasets
>> > >>>>>> methods
>> > >>>>>>                                  [8] base
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> other attached
>> > >>>>>> packages:
>> > >>>>>>
>> > >>>>>> [1] DESeq2_1.4.5            RcppArmadillo_0.4.200.0
>> > >>>>>> Rcpp_0.11.1
>> > >>>>>>
>> > >>>>>> [4] GenomicRanges_1.14.4    XVector_0.2.0
>> > >>>>>> IRanges_1.20.7
>> > >>>>>>
>> > >>>>>> [7] BiocGenerics_0.8.0
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>                                  loaded via a namespace (and not
>> > >>>>>> attached):
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> [1] annotate_1.40.1      AnnotationDbi_1.24.0
>> > >>>>>> Biobase_2.22.0
>> > >>>>>>
>> > >>>>>> [4] DBI_0.2-7            genefilter_1.44.0
>> > >>>>>> geneplotter_1.40.0
>> > >>>>>>
>> > >>>>>> [7] grid_3.1.0           lattice_0.20-29      locfit_1.5-9.1
>> > >>>>>>                                             [10]
>> > >>>>>> RColorBrewer_1.0-5
>> > >>>>>> RSQLite_0.11.4       splines_3.1.0
>> > >>>>>>         [13] stats4_3.1.0         survival_2.37-7
>> > >>>>>> XML_3.98-1.1
>> > >>>>>>                                                [16] xtable_1.7-3
>> > >>>>>>
>> > >>>>>>        [[alternative HTML version deleted]]
>> > >>>>>>
>> > >>>>>> _______________________________________________
>> > >>>>>> Bioconductor mailing list
>> > >>>>>> Bioconductor at r-project.org
>> > >>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > >>>>>> Search the archives:
>> > >>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> > >>
>> > >>
>
>



More information about the Bioconductor mailing list