[BioC] Model Design
Sander van der Zeeuw
s.a.j.van_der_zeeuw at lumc.nl
Tue Sep 2 16:28:53 CEST 2014
Hi Jim,
Sorry for replying only to you, i thought that i pressed reply all maybe
i had chosen the wrong button. But to come back to my comparison i have
a RNA-seq data from 4 mouse divided over 2 * 4 samples where one of the
2 is stemcell and the other isn't. I did the comparison of stemcell vs
no stemcell where sample pairing is not important. The last comparison i
want to do is to not look at differences in stemcell and no-stemcell,
but look at the differences between the 4 mouses. Therefore i paired the
samples as seen in the design.
I think that indeed i should use your'e last code: contrast <-
makeContrasts(subject1-subject2, subject1-subject3, subject1-subject4,
subject2-subject3, subject2-subject4, subject3-subject4, levels =
design). Since this is comparing all animals with each other. Thanks for
the help and explanation.
Best,
Sander
On 09/02/2014 04:19 PM, James W. MacDonald wrote:
> Hi Sander,
>
> Please don't take conversations off-list (e.g., use 'Reply all' when
> responding).
>
>
> On Mon, Sep 1, 2014 at 10:59 AM, Sander van der Zeeuw
> <s.a.j.van_der_zeeuw at lumc.nl <mailto:s.a.j.van_der_zeeuw at lumc.nl>> wrote:
>
> Hi Jim,
>
> thanks for the information, i have been struggling for a few days
> now trying to implement your tips. If i want to compare the 8
> samples divided into 4 subjects with no treatment or control or
> what so ever. I have only one model which produces a reasonable
> result. I just do not know what exactly is happening. This is the
> contrast and design model i have been using:
>
>
> This is a problem to start with. I have no idea what 'i want to
> compare the 8 samples divided into 4 subjects with no treatment or
> control or what so ever' means. You will need to be more clear on what
> your goals are for me (or anybody else) to be able to help you.
>
>
> contrast <- makeContrasts(subject1-subject2+subject3-subject4,
> levels = design);
> Contrasts
> Levels subject1 - subject2 + subject3 - subject4
> subject1 1
> subject2 -1
> subject3 1
> subject4 -1
>
>
> This contrast is testing an interaction between subjects 1 and 2
> versus subjects 4 and 3. Without knowing what the subjects are, I
> cannot say if this is a reasonable thing to be doing. But please note
> two things:
>
> 1.) Any comparison in ANOVA is just simple algebra, so if you have
> taken algebra, you should be able to figure out what the comparison is.
> 2.) Given 1.), we can decompose your contrast and see what it is testing:
>
> Your contrast is
> subject1 - subject2 + subject3 - subject4
>
> we can rewrite that as
>
> (subject1 - subject2) - (subject4 - subject3)
>
> so at a certain level, what you are doing is computing the difference
> between subjects 1 and 2, and then comparing that to the difference
> between 4 and 3. If the value in the first parenthesis is pretty close
> to the second, then you won't have much evidence for a difference.
> Alternatively, if the values in the two parentheses are different (and
> larger than expected given the intra-group variability), you will
> likely reject the null hypothesis and say there appears to be a
> difference.
>
> But what does this mean? In your case, I have no idea. In addition, I
> have no idea if you are really (or should be) looking for an
> interaction here. Which gets us back to my first point; what exactly
> (and I mean EXACTLY) are you trying to do?
>
> If you simply want to do all possible comparisons, then you just set
> it up in the most obvious way:
>
> contrast <- makeContrasts(subject1-subject2, subject1-subject3,
> subject1-subject4, subject2-subject3, subject2-subject4,
> subject3-subject4, levels = design)
>
> Best,
>
> Jim
>
>
>
> design <- model.matrix(~ 0 + subject, data = group );
> > design
>
> subject1 subject2 subject3 subject4
> S7309 1 0 0 0
> S7310 1 0 0 0
> S7311 0 1 0 0
> S7312 0 1 0 0
> S7313 0 0 1 0
> S7314 0 0 1 0
> S7315 0 0 0 1
> S7316 0 0 0 1
> attr(,"assign")
> [1] 1 1 1 1
> attr(,"contrasts")
> attr(,"contrasts")$subject
> [1] "contr.treatment"
>
> Do you think that this will be the correct setup now? Because i
> doubt this strongly. If you have any interesting tips please give
> them to me.
>
> Best,
>
> regards,
>
> Sander
>
>
>
> On 08/21/2014 04:12 PM, James W. MacDonald wrote:
>>
>> Hi Sander,
>>
>> On Aug 21, 2014 8:59 AM, "Sander [guest]" <guest at bioconductor.org
>> <mailto:guest at bioconductor.org>> wrote:
>> >
>> > Dear edgeR maintainers,
>> >
>> > I have some troubles setting up the correct design for my DE
>> experiment. I now how my experiment design should look like and
>> that is like this:
>> > > design
>> > subject1 subject2 subject3 subject4
>> > S7309 1 0 0 0
>> > S7310 1 0 0 0
>> > S7311 0 1 0 0
>> > S7312 0 1 0 0
>> > S7313 0 0 1 0
>> > S7314 0 0 1 0
>> > S7315 0 0 0 1
>> > S7316 0 0 0 1
>> > attr(,"assign")
>> > [1] 1 1 1 1
>> > attr(,"contrasts")
>> > attr(,"contrasts")$subject
>> > [1] "contr.treatment"
>> >
>> > After that i make my contrasts:
>> > > makeContrasts(subject1,subject2,subject3,subject4, levels =
>> design)
>>
>> This is where you have made a mistake. Toy should be subtracting
>> one subject from another, depending on the contrasts you care about.
>>
>> The resulting contrasts matrix should have both positive and
>> negative values, not just zeros and ones.
>>
>> See the edgeR users guide or limma users guide for examples.
>>
>> Best,
>>
>> Jim
>>
>> > Contrasts
>> > Levels subject1 subject2 subject3 subject4
>> > subject1 1 0 0 0
>> > subject2 0 1 0 0
>> > subject3 0 0 1 0
>> > subject4 0 0 0 1
>> >
>> > This all looks right since i need to compare the subjects with
>> each other. But when i run my analysis function which looks like
>> this:
>> >
>> > library( edgeR );
>> > library( ggplot2 );
>> > library( reshape );
>> > library( FactoMineR );
>> >
>> > analyse <- function( counts, design, contrast, name, style ) {
>> > counts <- counts[ rowSums( counts, na.rm = TRUE ) > 0, ];
>> > y <- DGEList( counts = counts, genes = rownames( counts ) );
>> > y <- calcNormFactors( y );
>> > y <- estimateGLMCommonDisp( y, design );
>> > y <- estimateGLMTrendedDisp( y, design, df = 5 );
>> > y <- estimateGLMTagwiseDisp( y, design );
>> >
>> > fit <- glmFit( y, design );
>> > lrt <- glmLRT( fit, contrast = contrast );
>> > de <- decideTestsDGE( lrt, p = 0.05, adjust = "BH" );
>> > cpmY <- cpm( y );
>> >
>> > daf <- designAsFactor( design );
>> > orderedDesign <- design[ order( daf, names( daf ) ), ];
>> > tab <- data.frame(
>> > row.names = rownames( cpmY ),
>> > genes = rownames( cpmY ),
>> > de = de,
>> > cpmY[ ,order( daf, names( daf ) ) ]
>> > );
>> >
>> > aRepTab <- topTags( lrt, n = nrow( counts ) )$table;
>> > aRepTab$rank <- 1:nrow( counts );
>> > # repTab <- tab[ match( aRepTab$genes, rownames( tab ) ), ];
>> >
>> > repTab <- merge( aRepTab, tab, by = "genes", sort = FALSE );
>> > repTab <- repTab[ order( repTab$rank ), ];
>> > # data.frame(
>> > # row.names = rownames( aRepTab ),
>> > # aRepTab,
>> > # tab[ match( aRepTab$genes, tab$genes ), ]
>> > # )
>> >
>> > list(
>> > name = name,
>> > y = y,
>> > fit = fit,
>> > lrt = lrt,
>> > de = de,
>> > tab = tab,
>> > style = style,
>> > repTab = repTab,
>> > orderedDesign = orderedDesign
>> > );
>> > }
>> >
>> > I got the following error:
>> >
>> > Error in mglmLevenberg(y, design = design, dispersion =
>> dispersion, offset = offset, :
>> > BLAS/LAPACK routine 'DGEMM ' gave error code -13
>> > 5 mglmLevenberg(y, design = design, dispersion = dispersion,
>> offset = offset,
>> > weights = weights, coef.start = start, maxit = 250)
>> > 4 glmFit.default(glmfit$counts, design = design0, offset =
>> glmfit$offset,
>> > weights = glmfit$weights, dispersion = glmfit$dispersion,
>> > prior.count = 0)
>> > 3 glmFit(glmfit$counts, design = design0, offset = glmfit$offset,
>> > weights = glmfit$weights, dispersion = glmfit$dispersion,
>> > prior.count = 0)
>> > 2 glmLRT(fit, contrast = contrast) at diffexpr.R#15
>> > 1 analyse(counts, design, contrast, countsId, style)
>> >
>> > I tried to use different models but i cannot succeed to avoid
>> the error for this comparison. (other comparisons do succeed) Any
>> hints will be very appreciated.
>> >
>> > Thanks in advance!
>> >
>> > Best regards,
>> >
>> > Sander
>> >
>> > -- output of sessionInfo():
>> >
>> > > sessionInfo()
>> > R version 3.1.0 (2014-04-10)
>> > Platform: x86_64-pc-linux-gnu (64-bit)
>> >
>> > locale:
>> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> LC_MONETARY=en_US.UTF-8
>> > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8
>> LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
>> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] splines stats graphics grDevices utils datasets
>> methods base
>> >
>> > other attached packages:
>> > [1] FactoMineR_1.26 reshape_0.8.5 ggplot2_1.0.0
>> reshape2_1.4 edgeR_3.6.7 limma_3.20.8
>> >
>> > loaded via a namespace (and not attached):
>> > [1] car_2.0-20 cluster_1.15.2 colorspace_1.2-4
>> digest_0.6.4 grid_3.1.0 gtable_0.1.2
>> > [7] htmltools_0.2.4 lattice_0.20-29 leaps_2.9
>> MASS_7.3-33 munsell_0.4.2 nnet_7.3-8
>> > [13] plyr_1.8.1 proto_0.3-10 Rcpp_0.11.2
>> rmarkdown_0.2.49 scales_0.2.4 scatterplot3d_0.3-35
>> > [19] stringr_0.6.2 tools_3.1.0 yaml_2.1.13
>> >
>> > --
>> > Sent via the guest posting facility at bioconductor.org
>> <http://bioconductor.org>.
>> >
>> > _______________________________________________
>> > Bioconductor mailing list
>> > Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
>
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
[[alternative HTML version deleted]]
More information about the Bioconductor
mailing list