[BioC] Model Design

Sander van der Zeeuw s.a.j.van_der_zeeuw at lumc.nl
Tue Sep 2 16:28:53 CEST 2014


Hi Jim,

Sorry for replying only to you, i thought that i pressed reply all maybe 
i had chosen the wrong button.  But to come back to my comparison i have 
a RNA-seq data from 4 mouse divided over 2 * 4 samples where one of the 
2 is stemcell and the other isn't. I did the comparison of stemcell vs 
no stemcell where sample pairing is not important. The last comparison i 
want to do is to not look at differences in stemcell and no-stemcell, 
but look at the differences between the 4 mouses. Therefore i paired the 
samples as seen in the design.

I think that indeed i should use your'e last code: contrast <- 
makeContrasts(subject1-subject2, subject1-subject3, subject1-subject4, 
subject2-subject3, subject2-subject4, subject3-subject4, levels = 
design). Since this is comparing all animals with each other. Thanks for 
the help and explanation.

Best,

Sander

On 09/02/2014 04:19 PM, James W. MacDonald wrote:
> Hi Sander,
>
> Please don't take conversations off-list (e.g., use 'Reply all' when 
> responding).
>
>
> On Mon, Sep 1, 2014 at 10:59 AM, Sander van der Zeeuw 
> <s.a.j.van_der_zeeuw at lumc.nl <mailto:s.a.j.van_der_zeeuw at lumc.nl>> wrote:
>
>     Hi Jim,
>
>     thanks for the information, i have been struggling for a few days
>     now trying to implement your tips. If i want to compare the 8
>     samples divided into 4 subjects with no treatment or control or
>     what so ever. I have only one model which produces a reasonable
>     result. I just do not know what exactly is happening. This is the
>     contrast and design model i have been using:
>
>
> This is a problem to start with. I have no idea what 'i want to 
> compare the 8 samples divided into 4 subjects with no treatment or 
> control or what so ever' means. You will need to be more clear on what 
> your goals are for me (or anybody else) to be able to help you.
>
>
>     contrast <- makeContrasts(subject1-subject2+subject3-subject4,
>     levels = design);
>               Contrasts
>     Levels     subject1 - subject2 + subject3 - subject4
>       subject1                                         1
>       subject2                                        -1
>       subject3                                         1
>       subject4                                        -1
>
>
> This contrast is testing an interaction between subjects 1 and 2 
> versus subjects 4 and 3. Without knowing what the subjects are, I 
> cannot say if this is a reasonable thing to be doing. But please note 
> two things:
>
> 1.) Any comparison in ANOVA is just simple algebra, so if you have 
> taken algebra, you should be able to figure out what the comparison is.
> 2.) Given 1.), we can decompose your contrast and see what it is testing:
>
> Your contrast is
> subject1 - subject2 + subject3 - subject4
>
> we can rewrite that as
>
> (subject1 - subject2) - (subject4 - subject3)
>
> so at a certain level, what you are doing is computing the difference 
> between subjects 1 and 2, and then comparing that to the difference 
> between 4 and 3. If the value in the first parenthesis is pretty close 
> to the second, then you won't have much evidence for a difference. 
> Alternatively, if the values in the two parentheses are different (and 
> larger than expected given the intra-group variability), you will 
> likely reject the null hypothesis and say there appears to be a 
> difference.
>
> But what does this mean? In your case, I have no idea. In addition, I 
> have no idea if you are really (or should be) looking for an 
> interaction here. Which gets us back to my first point; what exactly 
> (and I mean EXACTLY) are you trying to do?
>
> If you simply want to do all possible comparisons, then you just set 
> it up in the most obvious way:
>
> contrast <- makeContrasts(subject1-subject2, subject1-subject3, 
> subject1-subject4, subject2-subject3, subject2-subject4, 
> subject3-subject4, levels = design)
>
> Best,
>
> Jim
>
>
>
>     design <- model.matrix(~ 0 + subject, data = group );
>     > design
>
>           subject1 subject2 subject3 subject4
>     S7309        1        0        0        0
>     S7310        1        0        0        0
>     S7311        0        1        0        0
>     S7312        0        1        0        0
>     S7313        0        0        1        0
>     S7314        0        0        1        0
>     S7315        0        0        0        1
>     S7316        0        0        0        1
>     attr(,"assign")
>     [1] 1 1 1 1
>     attr(,"contrasts")
>     attr(,"contrasts")$subject
>     [1] "contr.treatment"
>
>     Do you think that this will be the correct setup now? Because i
>     doubt this strongly. If you have any interesting tips please give
>     them to me.
>
>     Best,
>
>     regards,
>
>     Sander
>
>
>
>     On 08/21/2014 04:12 PM, James W. MacDonald wrote:
>>
>>     Hi Sander,
>>
>>     On Aug 21, 2014 8:59 AM, "Sander [guest]" <guest at bioconductor.org
>>     <mailto:guest at bioconductor.org>> wrote:
>>     >
>>     > Dear edgeR maintainers,
>>     >
>>     > I have some troubles setting up the correct design for my DE
>>     experiment. I now how my experiment design should look like and
>>     that is like this:
>>     > > design
>>     >       subject1 subject2 subject3 subject4
>>     > S7309        1        0        0        0
>>     > S7310        1        0        0        0
>>     > S7311        0        1        0        0
>>     > S7312        0        1        0        0
>>     > S7313        0        0        1        0
>>     > S7314        0        0        1        0
>>     > S7315        0        0        0        1
>>     > S7316        0        0        0        1
>>     > attr(,"assign")
>>     > [1] 1 1 1 1
>>     > attr(,"contrasts")
>>     > attr(,"contrasts")$subject
>>     > [1] "contr.treatment"
>>     >
>>     > After that i make my contrasts:
>>     > > makeContrasts(subject1,subject2,subject3,subject4, levels =
>>     design)
>>
>>     This is where you have made a mistake. Toy should be subtracting
>>     one subject from another, depending on the contrasts you care about.
>>
>>     The resulting contrasts matrix should have both positive and
>>     negative values, not just zeros and ones.
>>
>>     See the edgeR users guide or limma users guide for examples.
>>
>>     Best,
>>
>>     Jim
>>
>>     >           Contrasts
>>     > Levels     subject1 subject2 subject3 subject4
>>     >   subject1        1        0        0        0
>>     >   subject2        0        1        0        0
>>     >   subject3        0        0        1        0
>>     >   subject4        0        0        0        1
>>     >
>>     > This all looks right since i need to compare the subjects with
>>     each other. But when i run my analysis function which looks like
>>     this:
>>     >
>>     > library( edgeR );
>>     > library( ggplot2 );
>>     > library( reshape );
>>     > library( FactoMineR );
>>     >
>>     > analyse <- function( counts, design, contrast, name, style ) {
>>     >   counts <- counts[ rowSums( counts, na.rm = TRUE ) > 0, ];
>>     >   y <- DGEList( counts = counts, genes = rownames( counts ) );
>>     >   y <- calcNormFactors( y );
>>     >   y <- estimateGLMCommonDisp( y, design );
>>     >   y <- estimateGLMTrendedDisp( y, design, df = 5 );
>>     >   y <- estimateGLMTagwiseDisp( y, design );
>>     >
>>     >   fit <- glmFit( y, design );
>>     >   lrt <- glmLRT( fit, contrast = contrast );
>>     >   de <- decideTestsDGE( lrt, p = 0.05, adjust = "BH" );
>>     >   cpmY <- cpm( y );
>>     >
>>     >   daf <- designAsFactor( design );
>>     >   orderedDesign <- design[ order( daf, names( daf ) ), ];
>>     >   tab <- data.frame(
>>     >     row.names = rownames( cpmY ),
>>     >     genes = rownames( cpmY ),
>>     >     de = de,
>>     >     cpmY[ ,order( daf, names( daf ) ) ]
>>     >   );
>>     >
>>     >   aRepTab <- topTags( lrt, n = nrow( counts ) )$table;
>>     >   aRepTab$rank <- 1:nrow( counts );
>>     > #  repTab <- tab[ match( aRepTab$genes, rownames( tab ) ), ];
>>     >
>>     >   repTab <- merge( aRepTab, tab, by = "genes", sort = FALSE );
>>     >   repTab <- repTab[ order( repTab$rank ), ];
>>     > #  data.frame(
>>     > #    row.names = rownames( aRepTab ),
>>     > #    aRepTab,
>>     > #    tab[ match( aRepTab$genes, tab$genes ), ]
>>     > #  )
>>     >
>>     >   list(
>>     >     name = name,
>>     >     y = y,
>>     >     fit = fit,
>>     >     lrt = lrt,
>>     >     de = de,
>>     >     tab = tab,
>>     >     style = style,
>>     >     repTab = repTab,
>>     >     orderedDesign = orderedDesign
>>     >   );
>>     > }
>>     >
>>     > I got the following error:
>>     >
>>     > Error in mglmLevenberg(y, design = design, dispersion =
>>     dispersion, offset = offset,  :
>>     >   BLAS/LAPACK routine 'DGEMM ' gave error code -13
>>     > 5 mglmLevenberg(y, design = design, dispersion = dispersion,
>>     offset = offset,
>>     >     weights = weights, coef.start = start, maxit = 250)
>>     > 4 glmFit.default(glmfit$counts, design = design0, offset =
>>     glmfit$offset,
>>     >     weights = glmfit$weights, dispersion = glmfit$dispersion,
>>     >     prior.count = 0)
>>     > 3 glmFit(glmfit$counts, design = design0, offset = glmfit$offset,
>>     >     weights = glmfit$weights, dispersion = glmfit$dispersion,
>>     >     prior.count = 0)
>>     > 2 glmLRT(fit, contrast = contrast) at diffexpr.R#15
>>     > 1 analyse(counts, design, contrast, countsId, style)
>>     >
>>     > I tried to use different models but i cannot succeed to avoid
>>     the error for this comparison. (other comparisons do succeed) Any
>>     hints will be very appreciated.
>>     >
>>     > Thanks in advance!
>>     >
>>     > Best regards,
>>     >
>>     > Sander
>>     >
>>     >  -- output of sessionInfo():
>>     >
>>     > > sessionInfo()
>>     > R version 3.1.0 (2014-04-10)
>>     > Platform: x86_64-pc-linux-gnu (64-bit)
>>     >
>>     > locale:
>>     >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C          
>>      LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8   
>>      LC_MONETARY=en_US.UTF-8
>>     >  [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8     
>>      LC_NAME=C   LC_ADDRESS=C               LC_TELEPHONE=C
>>     > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>     >
>>     > attached base packages:
>>     > [1] splines   stats     graphics  grDevices utils     datasets 
>>     methods   base
>>     >
>>     > other attached packages:
>>     > [1] FactoMineR_1.26 reshape_0.8.5  ggplot2_1.0.0 
>>      reshape2_1.4    edgeR_3.6.7  limma_3.20.8
>>     >
>>     > loaded via a namespace (and not attached):
>>     >  [1] car_2.0-20           cluster_1.15.2  colorspace_1.2-4   
>>      digest_0.6.4  grid_3.1.0           gtable_0.1.2
>>     >  [7] htmltools_0.2.4      lattice_0.20-29 leaps_2.9           
>>     MASS_7.3-33 munsell_0.4.2        nnet_7.3-8
>>     > [13] plyr_1.8.1           proto_0.3-10  Rcpp_0.11.2         
>>     rmarkdown_0.2.49  scales_0.2.4         scatterplot3d_0.3-35
>>     > [19] stringr_0.6.2        tools_3.1.0 yaml_2.1.13
>>     >
>>     > --
>>     > Sent via the guest posting facility at bioconductor.org
>>     <http://bioconductor.org>.
>>     >
>>     > _______________________________________________
>>     > Bioconductor mailing list
>>     > Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>>     > https://stat.ethz.ch/mailman/listinfo/bioconductor
>>     > Search the archives:
>>     http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
>
>
>
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099


	[[alternative HTML version deleted]]



More information about the Bioconductor mailing list