[BioC] Deciding on a cut off after QC

Gordon K Smyth smyth at wehi.EDU.AU
Mon May 16 14:21:03 CEST 2005


On Mon, May 16, 2005 9:49 pm, Ankit Pal said:
> Dear Dr Smyth,
> I'm sorry for not having specified which result file.
> It is the final result summary we get after we give
> the command
> Resultfile <- topTable(fit,n=200, adjust="fdr")
> A sample result file has been attached.
> The code I used for my analysis is
>
>> targets <- readTargets("target.txt")
>
> #The QC filter
>> myfun <- function(x,threshold=55){
> + okred <- abs(x[,"% > B635+2SD"]) > threshold
> + okgreen <- abs(x[,"% > B532+2SD"]) > threshold
> + okflag <- abs(x[,"Flags"]) > 0
> + okRGN <- abs(x[,"Rgn R²"]) > 0.6
> + as.numeric(okgreen || okred || okflag || okRGN)
> + }
> #end of QC filter
>
>> RG_7 <- read.maimages(targets$FileName,
> source="genepix",wt.fun=myfun)
>> RG_7$genes <- readGAL()

As I said last week, this readGAL() command is not needed with GenePix data, because read.maimages() already reads the gene annotation from the GPR files.  You should omit it.
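
A separate point on the weight function quoted further up: `||` is R's scalar operator, so it does not combine the four tests spot by spot. Older versions of R silently use only the first element of each vector, and R 4.3.0 and later stop with an error when the vectors are longer than one. Below is a minimal sketch of an element-wise version using `|`, run on a small made-up table with the same GenePix column names. The toy values and the function name `qc_weights` are hypothetical.

```r
# Toy stand-in for the per-spot columns of one GenePix file (values invented).
spots <- data.frame(
  "% > B635+2SD" = c(90, 10, 80, 5),
  "% > B532+2SD" = c(85, 20, 30, 2),
  "Flags"        = c(0, 0, -50, 0),
  "Rgn R²"       = c(0.9, 0.7, 0.8, 0.1),
  check.names = FALSE
)

# Hypothetical rewrite of the quoted filter: `|` works element by element,
# so every spot gets its own 0/1 weight.
qc_weights <- function(x, threshold = 55) {
  okred   <- abs(x[, "% > B635+2SD"]) > threshold
  okgreen <- abs(x[, "% > B532+2SD"]) > threshold
  okflag  <- abs(x[, "Flags"]) > 0
  okRGN   <- abs(x[, "Rgn R²"]) > 0.6
  as.numeric(okgreen | okred | okflag | okRGN)
}

w <- qc_weights(spots)
print(w)
```

With `|`, a spot keeps weight 1 as soon as any one test passes. If the intent is that every test must pass, `&` would be the operator to use, and whether `abs(Flags) > 0` is the right flag test depends on how the spots were flagged in GenePix.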

>> RG_7$printer <- getLayout(RG_7$genes)
>> MA_7 <- normalizeWithinArrays(RG_7,method="loess")
>> MA_7 <- normalizeBetweenArrays(MA_7)
>> fit_7 <- lmFit(MA_7, design=c(1,-1,1,-1))
>> fit_7 <- eBayes(fit_7)
>> options(digits=3)
>> Resultfile_7 <- topTable(fit_7, n=39000,
> adjust="fdr")
>> Resdat_7 <- data.frame(Resultfile_7)

Resultfile_7 is already a data.frame.
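
A small sketch of that point, using a made-up table in place of real topTable() output (the values and the output file name are hypothetical): wrapping a data.frame in data.frame() changes nothing, so the table can go straight to write.table().

```r
# Made-up stand-in for a topTable() result.
res <- data.frame(ID = c("probe1", "probe2"),
                  M = c(1.8, -0.4),
                  P.Value = c(0.001, 0.30))

# data.frame() applied to a data.frame returns an equivalent table.
stopifnot(is.data.frame(res), isTRUE(all.equal(data.frame(res), res)))

# Write tab-separated text; a .txt extension matches the tab separator.
out <- file.path(tempdir(), "Result.txt")
write.table(res, file = out, quote = FALSE, sep = "\t", row.names = FALSE)

# Read the file back to confirm the layout.
back <- read.delim(out)
print(back)
```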

>> write.table(Resdat_7,file='Result.csv',quote = FALSE,
> sep = "\t")
>
> I understand that the spots that do not qualify the QC
> filter are given a weight of "0" by limma and are not
> considered for normalization and will not affect the
> analysis.
> The result file I get contains all the spots (38000)
> in my case.

Mmm, this doesn't make sense.  You have 4 arrays with 39000 spots on each array.  Hence you have
4*39000 spots, not 39000.  The output from topTable() gives a summarized log-ratio for each probe,
not a result for each spot.
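
To make the spot versus probe distinction concrete, here is a rough base-R sketch. It is not limma's own code, only a weighted least-squares average of dye-swapped log-ratios for each probe. The log-ratios, the weights and the helper `probe_summary` are invented for illustration.

```r
design <- c(1, -1, 1, -1)   # dye-swap design from the quoted lmFit() call

# 3 probes x 4 arrays = 12 spots (values invented).
M <- rbind(probeA = c( 1.0, -1.2,  0.9, -1.1),
           probeB = c( 0.1,  0.0,  3.0, -0.1),
           probeC = c(-0.5,  0.4, -0.6,  0.5))

# Spot weights: the outlying spot of probeB on array 3 failed QC.
W <- rbind(probeA = c(1, 1, 1, 1),
           probeB = c(1, 1, 0, 1),
           probeC = c(1, 1, 1, 1))

# Weighted least-squares estimate of the log-ratio for one probe.
probe_summary <- function(m, w, d) sum(w * d * m) / sum(w * d^2)

est <- sapply(rownames(M), function(p) probe_summary(M[p, ], W[p, ], design))
print(length(M))   # number of spots
print(est)         # one summarized log-ratio per probe
```

In this sketch a zero weight removes one spot from its probe's average, but the probe still gets a row in the result. That is why the summary table has one row per probe regardless of how many individual spots were down-weighted.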

Gordon

> Didn't the spots that were bad get removed from the
> final result?
> If not, what is the cut off value (B, p etc) that I
> need to use to get a set of reliable spots (I can't use
> all the 38000) from my result file for my analysis?
> Is there a fixed formula to derive the same as the
> values vary with the analysis.
> Waiting for your reply,
> Thank you,
> -Ankit


