[BioC] reasonable Illumina hyperG test
    Sebastien Gerega 
    seb at gerega.net
       
    Fri Sep  5 07:18:00 CEST 2008
    
    
  
Hi,
I have been looking around at examples of the hyperGTest (in the 
GOstats, lumi, and other documentation) and feel like I have seen many 
slight variations on the methodology.
These variations are usually found in the way the non-specific filtering 
is performed. I haven't come across many examples of a hyperGTest for 
KEGG pathways and would like to ask whether my approach seems reasonable 
or whether I should make any changes.
Here is my code ("sig" is a vector of Entrez Gene IDs):
uni = exprs(lumi.N.P)
#Remove those without PATH annotation
havePATH = sapply(mget(allFeatures, lumiHumanAllPATH),
                  function(x) !(length(x) == 1 && is.na(x)))
uni <- uni[names(which(havePATH)),]
#Remove those with little variation across samples
iqrCutoff = 0.5
uni.IQR = apply(uni, 1, IQR)
uni = uni[which((uni.IQR > iqrCutoff) == TRUE),]
#Keep probes w/largest IQR
uni = uni[findLargest(rownames(uni), uni.IQR[rownames(uni)], 
"lumiHumanAll"),]
#mget returns a list; universeGeneIds needs a vector of unique Entrez IDs
uni = unique(unlist(mget(rownames(uni), lumiHumanAllENTREZID)))
params = new("KEGGHyperGParams", geneIds=sig, universeGeneIds = uni, 
annotation="lumiHumanAll", pvalueCutoff=0.05, testDirection="over")
hgOver = hyperGTest(params)
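
To make the filtering steps easier to check without the annotation package, here is a minimal base-R sketch of the same three steps on made-up data. The matrix "expr", the probe names, and the probe-to-gene map "probe2gene" are all hypothetical, and tapply/which.max only stand in for what I understand findLargest to do (keep one probe per gene, the one with the largest statistic).

```r
# Toy expression matrix: 6 probes x 4 samples (values are made up).
expr <- rbind(
  p1 = c(8.0, 8.1, 8.0, 8.1),    # nearly flat across samples
  p2 = c(6.0, 9.0, 6.5, 9.5),
  p3 = c(7.0, 8.5, 7.2, 8.8),
  p4 = c(5.0, 5.1, 5.0, 5.2),    # nearly flat across samples
  p5 = c(4.0, 7.0, 4.5, 7.5),
  p6 = c(9.0, 10.5, 9.1, 10.4)
)

# Hypothetical probe-to-gene map: p2 and p3 share a gene, p6 has no ID.
probe2gene <- c(p1 = "101", p2 = "102", p3 = "102",
                p4 = "103", p5 = "104", p6 = NA)

# Step 1: drop probes without an annotation.
annotated <- !is.na(probe2gene[rownames(expr)])
expr <- expr[annotated, , drop = FALSE]

# Step 2: drop probes whose IQR across samples is at or below the cutoff.
iqrCutoff <- 0.5
iqr <- apply(expr, 1, IQR)
expr <- expr[iqr > iqrCutoff, , drop = FALSE]
iqr <- iqr[rownames(expr)]

# Step 3: for each gene, keep only the probe with the largest IQR.
genes <- probe2gene[rownames(expr)]
best <- tapply(names(iqr), genes, function(p) p[which.max(iqr[p])])

# The universe is then one Entrez ID per surviving gene.
universe <- names(best)
```

In this toy case the two flat probes and the unannotated probe drop out, and the gene with two probes is represented once.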
Does this code/approach seem reasonable? Should I correct for multiple 
testing after the hyperGTest?
Would it be fair to perform a test on Gene Ontology terms in the same way 
(after changing the params class and specifying an ontology branch)?
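
To make the multiple-testing question concrete, here is a minimal base-R sketch of what I have in mind: one hypergeometric over-representation p-value per pathway, followed by an adjustment over all pathways tested. The universe, the "sig" list, and the three pathways are invented for illustration, and Benjamini-Hochberg is just one possible choice of method for p.adjust.

```r
# Made-up universe of 100 gene IDs, 10 of which are "significant".
universe <- as.character(1:100)
sig <- as.character(1:10)

# Hypothetical pathways (all members lie inside the universe).
pathways <- list(
  pathA = as.character(c(1:6, 50:53)),    # 6 of 10 members are in sig
  pathB = as.character(c(9, 10, 60:77)),  # 2 of 20 members are in sig
  pathC = as.character(80:94)             # 0 of 15 members are in sig
)

# P(X >= k): probability of drawing at least k pathway members when
# length(sig) genes are drawn from the universe without replacement.
overRep <- function(members) {
  k <- length(intersect(members, sig))
  m <- length(members)            # pathway genes in the universe
  n <- length(universe) - m       # remaining genes in the universe
  phyper(k - 1, m, n, length(sig), lower.tail = FALSE)
}

pvals <- sapply(pathways, overRep)

# Adjust across all pathways tested, not only those under the cutoff.
padj <- p.adjust(pvals, method = "BH")
```

On the real result I assume the same adjustment would be applied to the vector returned by pvalues(hgOver), but I am not sure whether that is the recommended practice.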
thanks,
Sebastien
    
    