[BioC] Comparing Two Affymetrix Arrays Question

Thu Jun 23 22:38:01 CEST 2011

Hi Ramzi,

There's nothing "wrong" with your results - there is simply not 
enough statistical evidence for differential expression between Group 
A and Group B. Is there a reason you are not following the 
limmaUserGuide() section 8.6, which goes through how to fit one model 
on 3 groups and pull out the pairwise comparisons you want? You'll 
get more power to detect differences between Groups A and B by 
fitting one model with all 3 groups instead of fitting separate models.

Best,
Jenny

At 03:30 PM 6/23/2011, Abboud, Ramzi wrote:
>Hello All,
>
>I am somewhat new to bioconductor, and am trying to accomplish what 
>I believe is a simple task.  I have 15 AffyMetrix gene signatures 
>from 15 subjects, each in the form of a .cel file.  The subjects are 
>grouped into 3 groups of 5 - let's say Group A, Group B, and Wild 
>Type.  I would like to compare the average gene expression of 
>subjects in Group A to those in Group B, and separately compare the 
>average gene expression in Group A to Wild Type.  In the results I 
>would like the genes most significantly different in expression 
>levels and the p-values for each gene comparison.
>
>I have code which I believe does this, but the results do not seem 
>totally correct, and I would like some help.
>
>I have two very similar R scripts (to keep things separate and 
>simple).  One compares Group A to Group B.  The other compares Group 
>A to Wild Type.  The results from comparing Group A to Wild Type 
>look correct.  However, the results from comparing Group A to Group 
>B give an adjusted p value of 0.999992827573282 for every single 
>gene.  Here are the top two lines from the output file (columns are 
>ID, logFC, AveExpr, t, P.Value, adj.P.Val, B) :
>
>42970   1458675_at      -0.402322454    4.770279273     -4.522575441 
>    0.000900789     0.999992828     -3.268563539
>23121   1438815_at      0.437319401     7.866319701     4.013307606 
>    0.002098968     0.999992828     -3.417357338
>
>Obviously something is not right.  All the other numbers from the 
>Group A vs Group B comparison look reasonable, but this adjusted p 
>value is making me doubt the whole thing.
>
>Does someone see a glaring and obvious mistake in my code (which is 
>included below)?  Is there a better or simpler way to do comparison?
>
>Please let me know if I can provide any additional information.  I 
>would be happy to provide the excel
>
>Thank you,
>Ramzi
>
>The following code compares Group A with Group B.  It is my R 
>Script, with notes.
>
>## Load Packages
>library(affy)       # Affymetrix pre-processing
>library(limma)      # two-color pre-processing; differential expression
>library(Biobase)
>
>## Read targets file.
>pd <- 
>read.AnnotatedDataFrame("TargetsAvsB.txt",header=TRUE,row.names=1,as.is=TRUE)
>
>## Read .CEL data.
>rawAffyData <- ReadAffy(filenames=pData(pd)$FileName,phenoData=pd)
>
>## Normalize the data
>eset <- rma(rawAffyData)
>
>## The target file information can be recovered from the eset.
>targets <- pData(eset)
>
>## Define a design matrix.
>#designMatrix <- createDesignMatrix(eset)
>designMatrix <- model.matrix(~ -1 + 
>factor(targets$Target,levels=unique(targets$Target)))
>colnames(designMatrix) <- unique(targets$Target)
>numParameters <- ncol(designMatrix)
>parameterNames <- colnames(designMatrix)
>
>## Define a contrasts matrix.
>#contrastsMatrix <- createContrastMatrix(eset, design=designMatrix
>contrastNames <- c(paste(parameterNames[1],parameterNames[2],sep="_vs_"))
>contrastsMatrix <- matrix(c(1,-1),nrow=ncol(designMatrix))
>rownames(contrastsMatrix) <- parameterNames
>colnames(contrastsMatrix) <- contrastNames
>
>## Fit to linear model.
>fit <- lmFit(eset,design=designMatrix)
>
>## Empirical Bayes statistics
>fitNoContrastsMatrix <- eBayes(fit)
>
>## Fit a linear model for the contrasts.
>fit2  <- contrasts.fit(fit,contrasts=contrastsMatrix)
>
>## Empirical Bayes statistics
>fit2  <- eBayes(fit2)
>
>numGenes <- nrow(eset)
>completeTable_A_vs_B <- topTable(fit2,number=numGenes)
>write.table(completeTable_A_vs_B 
>,file="A_vs_B_genes.xls",sep="\t",quote=FALSE,col.names=NA)
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at r-project.org
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives: 
>http://news.gmane.org/gmane.science.biology.informatics.conductor