[BioC] Limma and Genepix

Gordon K Smyth smyth at wehi.EDU.AU
Thu May 19 01:42:57 CEST 2005


> Date: Tue, 17 May 2005 07:45:43 -0400
> From: lepalmer at notes.cc.sunysb.edu
> Subject: [BioC] Limma and Genepix
> To: bioconductor at stat.math.ethz.ch
>
> This is the pipeline I have been using for analysis.  I just
> wanted people's opinions on whether things can be done better.  (It's 3 sets
> of dye-swaps with 2 spots per ORF per chip.)
>
> library(limma)
> targets<-readTargets("targets.txt")
> RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
> RG$printer<-getLayout(RG$genes)
> RG$genes<-readGAL("Y_pestis.sorted.gal")
> spottypes<-readSpotTypes("spotTypes.txt")
> RG$genes$Status<-controlStatus(spottypes,RG)
> RGb<-backgroundCorrect(RG,method="normexp")
> MA<-normalizeWithinArrays(RGb)
> MA<-normalizeBetweenArrays(MA)
> cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
> design<-c(1,-1,1,-1,-1,1)
> fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
> fit<-eBayes(fit)
> tt<-topTable(fit,adjust="fdr",n=6000)
> write.table(tt,file="tmp.txt",sep="\t")
>
> I have also recently read about the Kooperberg method for background
> correction.  Is this a preferred method?
> I have been able to do this with the following commands
>
> targets<-readTargets("targets.txt")
> RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
> RG$printer<-getLayout(RG$genes)
> RG$genes<-readGAL("Y_pestis.sorted.gal")
> spottypes<-readSpotTypes("spotTypes.txt")
> RG$genes$Status<-controlStatus(spottypes,RG)
> read.series(targets$FileName, suffix=NULL, skip=31, sep="\t")
> RGb <- kooperberg(targets$FileName, layout=RG$printer)
> RGb$genes<-RG$genes
> RGb$printer<-RG$printer
> RGb$weights<-RG$weights
> RGb$targets<-RG$targets
> MA<-normalizeWithinArrays(RGb)
> MA<-normalizeBetweenArrays(MA)
> cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
> design<-c(1,-1,1,-1,-1,1)
> fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
> fit<-eBayes(fit)
> topTable(fit,adjust="fdr",n=32)
> tt<-topTable(fit,adjust="fdr",n=6000)
> write.table(tt,file="tmp.txt",sep="\t")
>
> I recently had a small argument with an advisor who told me to do
> background correction by subtracting background from foreground and
> flagging negative numbers.  This is obviously the default for limma.  But
> when doing this approach, a lot of spots popped up that didn't make sense
> (i.e. non-specific DNA), while normexp fixed that problem.  I recently
> discovered Kooperberg, which was designed for the problem of negative
> intensities with Genepix data.  So which is the best method, and how do I
> convince this guy that I should use this method?

I don't think anyone knows which is the best method, but normexp and kooperberg are clearly
better than subtracting, as you have observed.
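
One way to see the difference is to count how many log-ratios each method loses. The sketch below is illustrative only: the intensities are simulated (they are not from the arrays discussed in this thread), and the offset of 50 is an assumed value, not a recommendation. Plain subtraction produces zero or negative corrected intensities, which become NA once logged, whereas normexp always returns positive intensities.

```r
library(limma)

set.seed(1)
n <- 1000   # spots per array (hypothetical)
k <- 2      # number of arrays (hypothetical)

# Simulated local background, and foreground = background + signal + noise.
sim_bg <- function() matrix(rnorm(n * k, mean = 100, sd = 15), n, k)
sim_fg <- function(bg) bg + matrix(rexp(n * k, rate = 1/50), n, k) +
                        matrix(rnorm(n * k, sd = 15), n, k)
Rb <- sim_bg()
Gb <- sim_bg()
RG <- new("RGList", list(R = sim_fg(Rb), G = sim_fg(Gb), Rb = Rb, Gb = Gb))

# Same data, two background correction methods.
RG.sub <- backgroundCorrect(RG, method = "subtract")
RG.ne  <- backgroundCorrect(RG, method = "normexp", offset = 50)

# Convert to log-ratios; non-positive intensities become NA.
MA.sub <- MA.RG(RG.sub)
MA.ne  <- MA.RG(RG.ne)

n.lost.sub <- sum(is.na(MA.sub$M))
n.lost.ne  <- sum(is.na(MA.ne$M))
cat("Log-ratios lost after subtraction:", n.lost.sub, "\n")
cat("Log-ratios lost after normexp:    ", n.lost.ne, "\n")
```

On real data the same comparison can be made by replacing the simulated RG with the object returned by read.maimages().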

> One last question I have is that these methods will give you some
> statistics on gene expression differences.  Often people report genes that
> are differentially regulated by more than two-fold.  It seems to me that
> to do this, one would need an intensity cutoff, as genes with little, or
> no expression can easily slip into that category. How would one calculate
> such a cutoff?

One of the beauties of using normexp or a similar offset background correction, together with
statistical criteria for differential expression, is that an intensity cutoff is not required.
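
A small arithmetic sketch of why an offset removes the need for a cutoff. The function name and the intensities are hypothetical, chosen only for illustration. Adding a constant to both channels shrinks the log-ratio of a dim spot towards zero but leaves a bright spot almost unchanged, so low-intensity spots no longer pass a fold-change threshold by chance.

```r
# Log2 ratio after adding a constant offset k to both channels.
offset_log_ratio <- function(R, G, k = 0) log2((R + k) / (G + k))

# Hypothetical background-corrected intensities: both spots are
# exactly four-fold different between the two channels.
dim_spot    <- c(R = 20,    G = 5)
bright_spot <- c(R = 20000, G = 5000)

lfc <- rbind(
  dim    = c(no_offset = offset_log_ratio(dim_spot[["R"]], dim_spot[["G"]]),
             offset_50 = offset_log_ratio(dim_spot[["R"]], dim_spot[["G"]], k = 50)),
  bright = c(no_offset = offset_log_ratio(bright_spot[["R"]], bright_spot[["G"]]),
             offset_50 = offset_log_ratio(bright_spot[["R"]], bright_spot[["G"]], k = 50))
)
print(round(lfc, 2))
```

Without an offset both spots have a log2 ratio of 2. With an offset of 50 the dim spot drops to about 0.35 while the bright spot stays at about 1.99.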

Gordon

> There are spots on the array that contain oligos that are
> definitely not found in the species being studied (bacteria vs
> Arabidopsis).  Can this information be used?
>
> Thanks,
> Lance Palmer


