[BioC] (no subject)

Liu, XiaoChuan xiaochuan.liu at mssm.edu
Fri Mar 8 16:54:55 CET 2013


Dear Dr. Simon Anders,

I have some questions on the multi-factor test in DESeq.
Here I have total 8 RNA-seq samples. There are two factors: condition(Wildtype and Knockout), and treatment(Picrotoxin and Control). And each has two biological replicates. Details I have attached (see Table 1 in attachment).
Now I want to use the generalized linear models to do the ANOVA-like analysis for these two factors and want to find the differentially expressed genes caused by the two factors. The part 2 in attachment are the different GLMs tests by DESeq. And the cut-off are chosen as P-value<0.05 and FDR<0.05 separately. The number of significant differentially expressed genes are counted by cut-offs. Details see Part 2 in attachment.
I also pick up raw counts for all the genes, then I try to use the function getVarianceStabilizedData() to transfer the raw counts to variance stabilizing transformation (VST). After that, I directly use the ANOVA function in R to calculate the P-value for each genes based on the VST value. When I get all P-value for all genes, I also do the p.adjust to calculate the FDR by method = "BH". Results see Part 3 in attachment.

Here are my questions:

1.      What is the meaning when I use “count ~ 1”? Here 1 is a cut-off? Or other meaning? I saw you give an example like this in DESeq Reference Manual. So I try to follow using it. But I do not know the meaning for test.

2.      In Part 2, I did 6 different GLMs tests by DESeq. And I also do the overlap amongst results. The overlap sometimes is very small. Why are they so different? How to explain it? Could you give me some comments?

3.      In Part 3, I also do the overlap with 6 results in Part 2. But the overlap are very small. I wonder if I make a mistake to misuse the variance stabilizing transformation? If I want to directly use the ANOVA function in R to calculate co-factor P-value, could I use the raw count? Or How to normalize the raw counts then I can use ANOVA function in R?

Thank you very much!

Looking forward to your response!

Best,

Leo





1.      Table1
         condition      treatment
WP1     Wildtype       Picrotoxin
WC1     Wildtype       Control
KP1      Knockout       Picrotoxin
KC1      Knockout       Control
WP2     Wildtype       Picrotoxin
WC2     Wildtype       Control
KP2      Knockout       Picrotoxin
KC2      Knockout       Control



2.      Multi-factor test
fit1 = fitNbinomGLMs( cds, count ~ treatment + condition )
fit0 = fitNbinomGLMs( cds, count ~ treatment )
pvalsGLM1 = nbinomGLMTest( fit1, fit0 )
padjGLM1 = p.adjust( pvalsGLM1, method="BH" )

Dataset1
P-value<0.05: 615
FDR<0.05:    20




fit3 = fitNbinomGLMs( cds, count ~ condition + treatment )
fit2 = fitNbinomGLMs( cds, count ~ condition )
pvalsGLM2 = nbinomGLMTest( fit3, fit2 )
padjGLM2 = p.adjust( pvalsGLM2, method="BH" )

Dataset2
P-value<0.05: 1959
FDR<0.05:    870



fit1 = fitNbinomGLMs( cds, count ~ treatment + condition )
fit4 = fitNbinomGLMs( cds, count ~ 1 )
pvalsGLM3 = nbinomGLMTest( fit1, fit4 )
padjGLM3 = p.adjust( pvalsGLM3, method="BH" )

Dataset3
P-value<0.05: 1767
FDR<0.05:    778

fit5 = fitNbinomGLMs( cds, count ~ condition * treatment )
fit0 = fitNbinomGLMs( cds, count ~ treatment )
pvalsGLM4 = nbinomGLMTest( fit5, fit0 )
padjGLM4 = p.adjust( pvalsGLM4, method="BH" )

Dataset4
P-value<0.05: 393
FDR<0.05:    26



fit5 = fitNbinomGLMs( cds, count ~ condition * treatment )
fit2 = fitNbinomGLMs( cds, count ~ condition )
pvalsGLM5 = nbinomGLMTest( fit5, fit2 )
padjGLM5 = p.adjust( pvalsGLM5, method="BH" )

Dataset5
P-value<0.05: 1450
FDR<0.05:    665



fit5 = fitNbinomGLMs( cds, count ~ condition * treatment )
fit4 = fitNbinomGLMs( cds, count ~ 1 )
pvalsGLM6 = nbinomGLMTest( fit5, fit4 )
padjGLM6 = p.adjust( pvalsGLM6, method="BH" )

Dataset6
P-value<0.05: 1465
FDR<0.05:    657


Overlap(P-value<0.05)

Dataset1

Dataset2

Dataset3

Dataset4

Dataset5

Dataset6

Dataset1













Dataset2

240











Dataset3

455

1511









Dataset4

356

178

342







Dataset5

220

1423

1378

184





Dataset6

396

1282

1432

328

1247








Overlap(FDR<0.05)

Dataset1

Dataset2

Dataset3

Dataset4

Dataset5

Dataset6

Dataset1













Dataset2

10











Dataset3

20

720









Dataset4

18

15

26







Dataset5

9

658

652

15





Dataset6

20

618

653

26

602










3.      Two way ANOVA After VST:

P-value<0.05: 671
FDR<0.05:    0


Overlap(P-value<0.05)

Dataset1

Dataset2

Dataset3

Dataset4

Dataset5

Dataset6

Two way ANOVA After VST

35

77

68

34

69

69






-------------- next part --------------
A non-text attachment was scrubbed...
Name: Anova.pdf
Type: application/pdf
Size: 106447 bytes
Desc: Anova.pdf
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20130308/e5c68473/attachment.pdf>


More information about the Bioconductor mailing list