[BioC] Contrast among a subset of interactions, glm in edgeR

Maximo Rivarola rivarola.maximo at inta.gob.ar
Fri Jun 20 11:25:48 CEST 2014


HI Gordon, thanks for your reply,
I guess I was not clear in my question, 
I am by NO means an expert in stats, ... :) as it shows....

I agree with your idea of making the groups and be done... I will do this!  I was not sure if that would be correct as of the model and so...
I did want to mention that my "idea" was to be able to contrast the following:

(cepaHA853t4INOC - cepaHA853t4Control) - (cepa853t8INOC - cepa853t8Control)

In my case, time points make a big difference and not so much inoculated, thats why this type of contrasts...

I guess i wanted to see the diff from 4day to 8 day inoculated  but not genes diff expre just because of time.
Thanks again!!!!
I will have to take a course in stats again,

great program and am learning  a lot,
cheers,
maximo


--------------------------------------------------------
Maximo Rivarola PhD.
________________________________________
From: Gordon K Smyth [smyth at wehi.EDU.AU]
Sent: Friday, June 20, 2014 5:46 AM
To: Maximo Rivarola
Cc: Bioconductor mailing list
Subject: Re: Contrast among a subset of interactions, glm in edgeR

Dear Maximo,

You ask why you don't see "all" the interactions as columns in your design
matrix.  Someone asks a variation of this question every week or so on
Bioconductor.  The confusion comes from a mis-understanding of what
interactions and contrasts actually are.

You say that you want to compare the interaction
"cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc", but I am pretty
sure that you actually do not want to do that.  Do you simply mean that
you want to compare time points 8 and 4 for genotype HA853 innoculated? I
am guessing that this is what you are actually after.

You cannot answer questions of this type by fitting an interaction model.
Instead of fitting the complicated and esoteric 3-way factorial model, it
would be simpler and more correct to follow the advice of Section 3.3.1 of
the edgeR User's Guide.  Simply combine the factors cepa, tiempo and inof
into a single factor with 18 levels.  Fit the model ~0+CombinedFactor,
then test the contrast:

  HA853.4.inoc - HA853.8.inoc

This is simple and intuitive as well as being correct.

Best wishes
Gordon


> Maximo Rivarola rivarola.maximo at inta.gob.ar
> Tue Jun 17 18:28:17 CEST 2014
>
> I am writing my first email to this list. I have a question in regards to
> contrasting a few interactions...
> I have an RNA-seq experiment with the following structure:
> - 3 cepa (genotypes)
> - 3 tiempo (time points)
> - inf (Inoculated or not)
>
> Basically, three different genotypes were infected or not (inf) and then 3
> time points were take and sequenced. Most samples have 2 biol replicates and
> some 4 biol replicates.
>
> My question is the following:
> I would like to make a contrast were I compare the interaction of
> cepaHA853:tiempo4:infinoc - cepaHA853:tiempo8:infinoc  ????
> My problem is I do not know where the column for this interaction is???
> I placed the 0 as for it not to intercept in the model, but the later factors
> are always shown from level 2 on...
> How can I "see" all the interaction from my design???
>
> Below is some code:
> Thanks in advanced!!! cheers,
>
> Design implemented:
> design7 <- model.matrix(~ 0 + cepa*tiempo*inf +  linea)
>
>> dim(design7)
> [1] 56 24
>
>> head(design7)
>  cepaHA853 cepaHA89 cepaRK416 tiempo4 tiempo8 infinoc linea2 linea3 linea4
> linea5 linea6 linea7 cepaHA89:tiempo4 cepaRK416:tiempo4
> 1         0        1         0       0       0       1      0      1 0      0
> 0      0                0                 0
> 2         0        1         0       0       0       1      0      0 1      0
> 0      0                0                 0
> 3         0        1         0       0       0       1      0      0 0      1
> 0      0                0                 0
> 4         0        1         0       0       0       1      0      0 0      0
> 1      0                0                 0
> 5         1        0         0       0       0       1      0      0 0      0
> 0      1                0                 0
> 6         1        0         0       0       0       1      0      0 0      0
> 0      0                0                 0
>  cepaHA89:tiempo8 cepaRK416:tiempo8 cepaHA89:infinoc cepaRK416:infinoc
> tiempo4:infinoc tiempo8:infinoc cepaHA89:tiempo4:infinoc
> 1                0                 0                1                 0 0
> 0                        0
> 2                0                 0                1                 0 0
> 0                        0
> 3                0                 0                1                 0 0
> 0                        0
> 4                0                 0                1                 0 0
> 0                        0
> 5                0                 0                0                 0 0
> 0                        0
> 6                0                 0                0                 0 0
> 0                        0
>  cepaRK416:tiempo4:infinoc cepaHA89:tiempo8:infinoc
> cepaRK416:tiempo8:infinoc
> 1                         0                        0 0
> 2                         0                        0 0
> 3                         0                        0 0
> 4                         0                        0 0
> 5                         0                        0 0
> 6                         0                        0 0
>>
>
> dge7 <- estimateGLMCommonDisp(dge1, design7, verbose = TRUE)  #To estimate
> common dispersion
> #  Disp = 0.03873 , BCV = 0.1968
>
> dge7 <- estimateGLMTagwiseDisp(dge7, design7) #To estimate tagwise
> dispersions
> # $tagwise.dispersion
> # [1] 0.21290421 0.01218267 0.01580605 0.00906055 0.02124947
>
> fit7 <- glmFit(dge7, design7, dispersion=dge7$tagwise.dispersion)
>
>> class(fit7)
> [1] "DGEGLM"
> attr(,"package")
> [1] "edgeR"
>
>> sessionInfo()
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C LC_TIME=es_ES.UTF-8
> LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_ES.UTF-8
> [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=es_ES.UTF-8       LC_NAME=C
> LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] edgeR_3.6.2  limma_3.20.4
>
> thanks a lot!
> hope is explained well....
>
> --------------------------------------------------------
> Maximo Rivarola
>

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:6}}



More information about the Bioconductor mailing list