[BioC] ANOVA test after VST in DESeq

Liu, XiaoChuan xiaochuan.liu at mssm.edu
Thu Mar 14 16:57:29 CET 2013


Dear Simon,

Here are my code for ANOVA test after VST in DESeq, Do you think it is right to do ANOVA test by VST? Thanks!

Code:
CountTable <- read.table("all.samples.count.txt",header=TRUE, row.names=1)

library('DESeq')
Design<-data.frame(row.names=colnames(CountTable), condition = c("Wildtype","Wildtype","Knockout","Knockout","Wildtype","Wildtype","Knockout","Knockout"), treatment = c("Picrotoxin","Control","Picrotoxin","Control","Picrotoxin","Control","Picrotoxin","Control"))

cds <- newCountDataSet( CountTable, Design )
cds <- estimateSizeFactors( cds )
cds <- estimateDispersions( cds )

x<-getVarianceStabilizedData(cds)
CountTable_new <- x

nrow_number<-nrow(CountTable_new)
ncol_number<-ncol(CountTable_new)

all_P_value<-0;

for(i in 1:nrow_number){

t1<-as.matrix(Design$condition)
t2<-as.matrix(Design$treatment)
t3<-as.matrix(CountTable_new[i,])

anovaTable<-matrix(c(t1,t2,t3),nrow=ncol_number,ncol=3,dimnames = list(c(1:ncol_number),c("condition","treatment","expression")))

anovaTable_data<-as.data.frame(anovaTable)
anovaTable_data$expression<-as.integer(anovaTable_data$expression)
int <- aov(expression ~ condition*treatment, data= anovaTable_data)
int_matrix<-as.matrix(anova(int))
p_value=int_matrix[3,5]
all_P_value<-c(all_P_value,p_value)
}
all_P_value<-all_P_value[2:(nrow_number+1)]

all_P_value<-as.matrix(all_P_value)


anova<-matrix(c(CountTable_new,all_P_value,all_P_value),nrow=nrow_number,ncol=(ncol_number+2),dimnames = list(rownames(CountTable_new),c(colnames(CountTable_new),"p_value","q_value")))

anova<-as.data.frame(anova)
anova$q_value<-p.adjust(as.numeric(anova$p_value),method = "BH")

output_file_name="two_way_ANOVA.txt"
write.table( anova, file=output_file_name, quote= FALSE, sep="\t", row.names= TRUE )


Leo


>3.      In Part 3, I also do the overlap with 6 results in Part 2.
> But the overlap are very small. I wonder if I make a mistake to
> misuse the variance stabilizing transformation? If I want to directly
> use the ANOVA function in R to calculate co-factor P-value, could I
> use the raw count? Or How to normalize the raw counts then I can use
> ANOVA function in R?

No, ordinary-least-square (OLS) ANOVA requires data to be homoscedastic 
and this count data is not. This is, after all, the whole point of 
either using GLMs on the raw count data, or OLS ANOVA on 
variance-stabilized data.

I would have expected some similarity between the results of a GLM 
ANODEV anaysis of the count data and the OLS ANOVA analysis on the vST 
data, but as you did not post the code you used, it is hard to say 
whether you may have made a mistake.

   Simon

_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


More information about the Bioconductor mailing list